Message from @porco
Discord ID: 463368293240274948
and 24 cores
but really, splitting the database would only make it slower
my formula to decide if a video is relevant right now is `w = (log(dtf)+1)/sumdtf * U/(1+0.0115*U) * log((N-nf)/nf)`
```
dtf is the number of times the term appears in the document
sumdtf is the sum of (log(dtf)+1)'s for all terms in the same document
U is the number of Unique terms in the document
N is the total number of documents
nf is the number of documents that contain the term
```
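Rough sketch of that weight formula in Python, to make the pieces concrete. This is not the actual pornspider code: the function name `term_weights`, the `doc_freq` lookup, the use of natural log, and returning 0 for terms where `nf` is 0 or equal to `N` are all assumptions made for illustration.

```python
import math
from collections import Counter

def term_weights(doc_terms, doc_freq, n_docs):
    """Return {term: w} for one document, following
    w = (log(dtf)+1)/sumdtf * U/(1+0.0115*U) * log((N-nf)/nf).

    doc_terms: list of terms in the document (hypothetical input shape)
    doc_freq:  {term: nf}, number of documents containing the term
    n_docs:    N, total number of documents
    """
    counts = Counter(doc_terms)                       # dtf for every term
    log_tf = {t: math.log(c) + 1 for t, c in counts.items()}
    sumdtf = sum(log_tf.values())                     # sum of (log(dtf)+1)
    u = len(counts)                                   # U, unique terms
    length_norm = u / (1 + 0.0115 * u)

    weights = {}
    for term, ltf in log_tf.items():
        nf = doc_freq.get(term, 0)
        if nf <= 0 or nf >= n_docs:
            # Assumption: the log is undefined here, so give the term no weight.
            weights[term] = 0.0
            continue
        idf = math.log((n_docs - nf) / nf)            # natural log assumed
        weights[term] = ltf / sumdtf * length_norm * idf
    return weights
```

Note that a term found in exactly half of all documents gets weight 0, and anything more common than that goes negative, since `log((N-nf)/nf)` drops below zero.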
>math
🤢
```
U is the number of Unique terms in the document
N is the total number of documents
nf is the number of documents that contain the term
```
these won't work if I split the db
and you will get shit results
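Quick illustration of why, using made-up numbers: `N` and `nf` are counted over the whole database, so if each half only sees its own documents, the same term gets a different `log((N-nf)/nf)` on each half and the scores stop being comparable. The `idf` helper and the shard counts below are hypothetical.

```python
import math

def idf(n_docs, nf):
    # The log((N-nf)/nf) part of the formula, natural log assumed.
    return math.log((n_docs - nf) / nf)

# Hypothetical term that appears in 100 of 1000 documents overall,
# but unevenly: 90 hits land in shard A, 10 in shard B (500 docs each).
global_idf = idf(1000, 100)
shard_a_idf = idf(500, 90)
shard_b_idf = idf(500, 10)

print(f"global:  {global_idf:.3f}")
print(f"shard A: {shard_a_idf:.3f}")
print(f"shard B: {shard_b_idf:.3f}")
```

Shard A underrates the term and shard B overrates it compared to the full database, so merging the two result lists would rank documents by inconsistent weights.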
my way right now is OK, the problem is the data quality
for this, I ordered myself a nice GTX 1080 TI to put into the server that I'm waiting for
about 30% of my database is tagged and categorized correctly, the rest is trash
I am working on a ML model to categorize the rest
1080ti? damn
I'm dumb so how come a GPU will help with things? <:GWchinaSakuraThinking:398950680217255977>
ml accel
Or will it boost the machine learning shit
look at their examples
like the titanic survival rate
Never got too deep into it other than a forced "something" tutorial on MatLab
Though, ideally, you should be grabbing something like Volta if you move to a different ML framework, like TensorFlow
ohshit
I didn't know C# had ML
I always thought that for ML you'd use python and only python
my priorities right now are
- get better hardware
- improve site performance
- improve database quality to deliver better search results
- add more features
add more features would be your filters
`- add more features`
pornspider pass <:GWnanaPepoHype:392308469488680961>
but since I do this on a freetime basis and you can see I have more important tasks before that, it might take a while
because filters are useless, if half my datasets don't have a duration in the first place
first, I need to increase the data quality
Also .NET ML is much nicer to use than python
give it a try
Microsoft is doing everything right except for windows, these days
I will, once I finish the MVC course
MS is doing Windows correctly
>
just not consumer Windows
>
Windows Server sucks ass
there's a reason I'm running .NET pornspider on Linux with mono