Message from @realz
Discord ID: 781073157162270760
it results in a small number of edges
Sadly, I don't have a lot of experience with data analysis... My attempt at an example of how simple it would be for someone to hide code in a vote tabulator is generally as far as I have to go; otherwise it's either data input or display of already-analyzed data.
It's nothing special and I put maybe an hour into it to prove what I was thinking. But it has no place around real data...
Not that it couldn't be used to train an AI to look for similar style inconsistencies.
cool
Gephi is still frozen
What's your resource monitor say?
it says Gephi is using 1 full core
6 GB of RAM
and running very inefficiently w.r.t power 🙂
man, that 500k-edge graph is taking almost an hour to turn into an SVG
OK I need a better way to visualize dot files
Can you do smaller chunks of the data and overlay the dots in layers?
oh interesting idea
I can probably see how many connected components there are
It would let you still visualize the data, but in smaller bite-size pieces rather than choking on the whole thing at once
Since SVG is just a glorified XML file, smaller ones can be merged into a bigger one
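A minimal sketch of that merge idea, using only Python's standard `xml.etree.ElementTree`. The function name `merge_svgs` and the `x_step` parameter are made up for illustration. It assumes each input is a well-formed SVG string, and it drops per-file root attributes such as `viewBox`, so real Graphviz output would likely need its scaling adjusted afterwards.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # serialize without an ns0: prefix


def merge_svgs(svg_texts, x_step=0):
    """Wrap each SVG's children in its own <g> layer under one new <svg> root.

    x_step=0 overlays the layers; a positive x_step shifts layer i
    to the right by i * x_step so the pieces sit side by side.
    """
    merged = ET.Element(f"{{{SVG_NS}}}svg")
    for i, text in enumerate(svg_texts):
        root = ET.fromstring(text)
        layer = ET.SubElement(
            merged,
            f"{{{SVG_NS}}}g",
            {"id": f"layer{i}", "transform": f"translate({i * x_step},0)"},
        )
        layer.extend(list(root))  # move the original drawing into the layer
    return ET.tostring(merged, encoding="unicode")
```

Keeping each piece in its own `<g>` means a layer can be hidden or moved later without touching the others.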
yeah, there is no need to see the connected components together
lemme first see how many components there are
or if it is a giant hairball
Well there you go, that should optimize the process considerably.
Unless it's a rat's nest
there are 2092 connected components with 5 nodes or more
Let me switch over to my computer so I can actually look at the GitHub repo @DrSammyD posted. My phone chokes trying to open it.
you are only going to make your computer suffer
```
G.number_of_nodes(): 90259
G.number_of_edges(): 313771
len(connected_component_sizes): 31528
len(connected_component_sizes): 2092
connected_component_sizes[:20] [1030, 1022, 996, 865, 762, 622, 610, 583, 493, 464, 421, 397, 384, 371, 347, 331, 330, 325, 315, 287]
Gp.number_of_nodes(): 10955
Gp.number_of_edges(): 164456
```
by picking the top 20 connected components, I got about 1/2 the edges (164456 of 313771) with only 10955 of the 90259 nodes
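For anyone following along, here is a rough sketch of that filtering step in plain Python, so it runs without any graph library. It is not the script that produced the numbers above. The function names and the `k` / `min_size` parameters are illustrative, and it assumes the graph is given as a list of undirected `(u, v)` edge pairs, so isolated nodes with no edges are not counted. With networkx, the same idea maps onto `nx.connected_components(G)` and `G.subgraph(nodes)`.

```python
from collections import defaultdict


def connected_components(edges):
    """Return components as sets of nodes, largest first."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        # Iterative depth-first search from an unvisited node.
        stack, comp = [start], {start}
        seen.add(start)
        while stack:
            node = stack.pop()
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    comp.add(nb)
                    stack.append(nb)
        comps.append(comp)
    return sorted(comps, key=len, reverse=True)


def keep_top_components(edges, k=20, min_size=5):
    """Keep only the k largest components that have at least min_size nodes."""
    comps = [c for c in connected_components(edges) if len(c) >= min_size][:k]
    kept_nodes = set().union(*comps)
    kept_edges = [(u, v) for u, v in edges if u in kept_nodes and v in kept_nodes]
    return kept_nodes, kept_edges
```

Comparing `len(kept_edges)` with `len(edges)` gives the fraction of edges that survive the cut, which is the "1/2 the edges" figure here.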
though I do worry that cleaning the data this way might make it seem suspicious when it isn't or vice versa
@AntiFish03 This one should run on your phone https://bitcadia.github.io/DownBallot/compare.html
now I have three fdp operations running to convert dot to svg at the same time
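If the graph is already split into several dot files, the parallel conversion could be scripted roughly like this. It assumes Graphviz is installed and `fdp` is on the PATH. The helper names `svg_command` and `render_all` are made up for this sketch, and three workers just mirrors the three runs mentioned above.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def svg_command(dot_path, engine="fdp"):
    """Build the Graphviz command that renders one dot file to SVG next to it."""
    dot_path = Path(dot_path)
    return [engine, "-Tsvg", str(dot_path), "-o", str(dot_path.with_suffix(".svg"))]


def render_all(dot_paths, engine="fdp", workers=3):
    """Run several layout processes at once; raises if any of them fails."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(subprocess.run, svg_command(p, engine), check=True)
            for p in dot_paths
        ]
        for future in futures:
            future.result()  # re-raise errors from the worker threads
```

Threads are enough here because each one only waits on an external process; the layout work itself happens inside the separate `fdp` processes, one core each.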
I already pulled my MBP out... so I am looking at the repo directly... just trying to first get a handle on what the code is doing before I try to wrap my head around the deluge of data
Indeed. I have a few things generated
That one is basically the Shiva data on a county-by-county scale per state
I don't think I've ever run my CPU this hot lol
It's a giant cluster of data to try and analyze. I know this might sound weird, but have you tried doing statistical analysis for mean, median, mode, quartiles, and outliers? It might let you look for out-of-the-ordinary things.
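As a sketch of what that summary could look like with Python's standard `statistics` module: the `summarize` name is made up, and the 1.5 × IQR fence is just one common rule of thumb for flagging outliers, not something anyone here has settled on. `values` would be whatever single numeric column is being examined.

```python
import statistics


def summarize(values):
    """Basic descriptive statistics plus outliers by the 1.5 * IQR rule."""
    q1, q2, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "modes": statistics.multimode(values),  # every value tied for most common
        "quartiles": (q1, q2, q3),
        "outliers": [v for v in values if v < low or v > high],
    }
```

An outlier flagged this way is only a pointer to where to look next; it says nothing on its own about why the value is unusual.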
OK I found a `neato` option that _drastically_ speeds up generation
I'm really looking at this with a fresh set of eyes at the moment, not knowing what has or hasn't been done, so I'm not trying to derail anything, just asking questions