Message from @AntiFish03

Discord ID: 781073303678746625


2020-11-25 08:16:12 UTC  

It's nothing special and I put maybe an hour into it to prove what I was thinking. But it has no place around real data...

2020-11-25 08:17:32 UTC  

Not that it couldn't be used as a trainer for an AI to look for similar style inconsistencies.

2020-11-25 08:18:02 UTC  

cool

2020-11-25 08:18:13 UTC  

gephi is still frozen

2020-11-25 08:18:32 UTC  

What's your resource monitor say?

2020-11-25 08:19:24 UTC  

it says Gephi is using 1 full core

2020-11-25 08:19:33 UTC  

6 GB of RAM

2020-11-25 08:19:49 UTC  

and running very inefficiently w.r.t power 🙂

2020-11-25 08:20:47 UTC  

man that 500k edge graph is taking almost an hour to turn into an svg

2020-11-25 08:21:00 UTC  

OK I need a better way to visualize dot files

2020-11-25 08:21:45 UTC  

Can you do smaller chunks of the data and overlay the dots in layers?

2020-11-25 08:21:46 UTC  

@AntiFish03, you just advanced to level 2!

2020-11-25 08:22:56 UTC  

oh interesting idea

2020-11-25 08:23:06 UTC  

I can probably see how many connected graphs there are

2020-11-25 08:23:59 UTC  

It would still let you visualize the data, but in smaller bite-size pieces rather than choking on the whole thing at once

2020-11-25 08:25:01 UTC  

Since SVG is just a glorified XML file, smaller ones can be merged with others into a bigger one
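A minimal sketch of that merge idea, using only Python's standard library. It assumes each input SVG carries `width` and `height` attributes in the same unit (Graphviz normally writes `pt`), and it stacks the pieces top to bottom as nested `<svg>` elements so each keeps its own `viewBox`. The names `merge_svgs` and `parse_length` are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # serialize without ns0: prefixes

_LENGTH = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([a-z%]*)\s*$")


def parse_length(value):
    """Split an SVG length such as '120pt' into (120.0, 'pt')."""
    match = _LENGTH.match(value or "")
    if not match:
        return 0.0, ""
    return float(match.group(1)), match.group(2)


def merge_svgs(svg_texts):
    """Stack several SVG documents vertically inside one outer <svg>.

    Assumption: all inputs use the same length unit (or none at all).
    """
    root = ET.Element(f"{{{SVG_NS}}}svg")
    y_offset, max_width, unit = 0.0, 0.0, ""
    for text in svg_texts:
        part = ET.fromstring(text)
        width, unit_w = parse_length(part.get("width"))
        height, unit_h = parse_length(part.get("height"))
        unit = unit or unit_w or unit_h
        # A nested <svg> is positioned with x/y and keeps its own viewBox.
        part.set("x", "0")
        part.set("y", f"{y_offset:g}{unit}")
        root.append(part)
        y_offset += height
        max_width = max(max_width, width)
    root.set("width", f"{max_width:g}{unit}")
    root.set("height", f"{y_offset:g}{unit}")
    return ET.tostring(root, encoding="unicode")
```

Nesting whole `<svg>` elements avoids rewriting the coordinates inside each piece, at the cost of a slightly larger file.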

2020-11-25 08:25:43 UTC  

yea there is no need to see the connected components together

2020-11-25 08:25:52 UTC  

I don't even need to merge it

2020-11-25 08:26:15 UTC  

lemme first see how many components there are

2020-11-25 08:26:27 UTC  

or if it is a giant hairball

2020-11-25 08:26:27 UTC  

Well there you go, that should optimize the process considerably.

2020-11-25 08:26:54 UTC  

Unless it's a rat's nest

2020-11-25 08:28:33 UTC  

there are 2092 connected components with 5 nodes or more

2020-11-25 08:29:20 UTC  

Let me switch over to my computer so I can actually look at the github repo @DrSammyD posted. My phone chokes trying to open it up

2020-11-25 08:34:53 UTC  

you are only going to make your computer suffer

2020-11-25 08:36:00 UTC  

```
G.number_of_nodes(): 90259
G.number_of_edges(): 313771
len(connected_component_sizes): 31528
len(connected_component_sizes): 2092
connected_component_sizes[:20] [1030, 1022, 996, 865, 762, 622, 610, 583, 493, 464, 421, 397, 384, 371, 347, 331, 330, 325, 315, 287]
Gp.number_of_nodes(): 10955
Gp.number_of_edges(): 164456

```

2020-11-25 08:36:23 UTC  

by picking the top 20 connected components, I got 1/2 the edges
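For reference, a rough sketch of that filtering step. The output above looks like it came from a graph library with methods such as `G.number_of_nodes()`. This version is an assumption-laden stand-in that works on a plain edge list with only the standard library, so it is not the script that produced those numbers. `connected_components` and `keep_largest` are hypothetical names, and nodes with no edges are not counted because they never appear in the edge list.

```python
from collections import defaultdict, deque


def connected_components(edges):
    """Return one node set per connected component of an undirected edge list."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search from each unvisited node.
        seen.add(start)
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


def keep_largest(edges, min_size=5, top_k=20):
    """Drop components below min_size, then keep edges of the top_k largest."""
    components = [c for c in connected_components(edges) if len(c) >= min_size]
    components.sort(key=len, reverse=True)
    kept_nodes = set().union(*components[:top_k])
    kept_edges = [(u, v) for u, v in edges if u in kept_nodes and v in kept_nodes]
    return components, kept_edges
```

With `min_size=5` and `top_k=20` this mirrors the choices described in the chat: small components are discarded first, and the fraction of edges kept is `len(kept_edges) / len(edges)`.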

2020-11-25 08:36:42 UTC  

though I do worry that cleaning the data this way might make it seem suspicious when it isn't or vice versa

2020-11-25 08:37:04 UTC  
2020-11-25 08:37:26 UTC  

now I have three fdp operations running to convert dot to svg at the same time
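Running several conversions side by side can be scripted. Below is a small sketch, assuming Graphviz's `fdp` is on the PATH and each `.dot` file should become an `.svg` next to it. The helper names and the worker count of 3 are illustrative, not taken from the chat.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def fdp_command(dot_path):
    """Build the fdp command that renders one DOT file to SVG."""
    dot_path = Path(dot_path)
    svg_path = dot_path.with_suffix(".svg")
    return ["fdp", "-Tsvg", str(dot_path), "-o", str(svg_path)]


def render_all(dot_paths, max_workers=3):
    """Run up to max_workers fdp processes at once; return their exit codes."""
    def run(path):
        return subprocess.run(fdp_command(path), check=False).returncode

    # Threads are enough here: each one only waits on an external process.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run, dot_paths))
```

Each `fdp` process is single-threaded work for the machine, so the worker count is best kept at or below the number of free cores.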

2020-11-25 08:39:49 UTC  

I already pulled my MBP out... so I am looking at the repo directly... just trying to first get a handle on what the code is doing before I try to wrap my head around the deluge of data

2020-11-25 08:40:16 UTC  

Indeed. I have a few things generated

2020-11-25 08:40:35 UTC  

That one is basically the Shiva data on a county-by-county scale per state

2020-11-25 08:43:35 UTC  

I don't think I've ever run my CPU this hot lol

2020-11-25 08:43:43 UTC  

It's a giant cluster of data to try and analyze. I know this might sound weird, but have you tried doing statistical analysis for mean, median, mode, quartiles and the outliers? It might let you look for out-of-the-ordinary things
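A quick sketch of that kind of summary using Python's built-in `statistics` module. The outlier rule here (more than 1.5 times the interquartile range beyond the quartiles) is one common convention chosen for illustration, not something agreed on in the chat, and `summarize` is a made-up helper name.

```python
import statistics


def summarize(values):
    """Return mean, median, mode, quartiles and IQR-based outliers."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    # Assumed convention: flag points beyond 1.5 * IQR from the quartiles.
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "quartiles": (q1, q2, q3),
        "outliers": [v for v in values if v < low or v > high],
    }


# Made-up numbers purely to show the call; not real data.
example = summarize([10, 12, 12, 13, 14, 15, 16, 95])
```

A flagged value only says it sits far from the rest of the numbers; whether it means anything still needs a look at where the number came from.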

2020-11-25 08:45:11 UTC  

OK I found a `neato` option that _drastically_ speeds up generation

2020-11-25 08:45:20 UTC  

I'm really looking at this as a clean set of eyes at the moment, not knowing what has or hasn't been done so not trying to derail anything just asking questions

2020-11-25 08:45:20 UTC  

@AntiFish03, you just advanced to level 3!

2020-11-25 08:46:23 UTC  

there is nothing to derail heh

2020-11-25 08:46:28 UTC  

ask your questions

2020-11-25 08:47:51 UTC  

(gephi is still frozen lol)