Message from @porco
Discord ID: 463367726287683607
I really can't add any search complexity
>open pornspider
>click teen
>teenage guys masturbating in their own mouth
<:woah:333623269674713098>
Not even basic ```csharp
if (video.timeDuration > userFilter) { return video; }```
Though I have no idea about hardware usage of pornspider
if I do it exactly like that, it will slow down your search by about 15-20 min / request because it would have to iterate through 15 million videos
I use a lot of btree indexes
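A rough sketch of why a naive duration filter is slow and an index helps. This is not pornspider's actual code: the `(video_id, duration)` tuples, the function names, and the sorted list standing in for a B-tree index are all assumptions for illustration.

```python
import bisect

def scan_filter(videos, min_duration):
    """Naive filter: touches every (video_id, duration) pair, O(n) per request."""
    return [vid for vid, dur in videos if dur > min_duration]

def build_duration_index(videos):
    """Build once: parallel lists sorted by duration (a stand-in for a B-tree index)."""
    ordered = sorted(videos, key=lambda pair: pair[1])
    durations = [dur for _, dur in ordered]
    ids = [vid for vid, _ in ordered]
    return durations, ids

def index_filter(index, min_duration):
    """Indexed filter: binary search for the cut-off, then return the tail."""
    durations, ids = index
    # First position whose duration is strictly greater than min_duration.
    pos = bisect.bisect_right(durations, min_duration)
    return ids[pos:]

# Hypothetical data: (video_id, duration in seconds).
videos = [(1, 30), (2, 600), (3, 120), (4, 600)]
index = build_duration_index(videos)
print(scan_filter(videos, 120))
print(index_filter(index, 120))
```

Both functions return the same videos. The difference is that the scan has to look at all 15 million rows on every request, while the indexed lookup only needs a binary search plus the rows that actually match.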
What about splitting the database?
0-5 > db1
10-15 > db2
20+ > db3?
then I can't use the vector-space formula anymore if I chunk the database
I just need more ram
download mowe wam
I ordered hardware on ebay
3200 $ in server hardware
and I rented a colocation space for 160 $ / month near my home
then, I will be able to add whatever filters are possible
because I will have 196 GB ram
and 24 cores
but really, splitting the database would only make it slower
my formula to decide if a video is relevant right now is `w = (log(dtf)+1)/sumdtf * U/(1+0.0115*U) * log((N-nf)/nf)`
```
dtf is the number of times the term appears in the document
sumdtf is the sum of (log(dtf)+1)'s for all terms in the same document
U is the number of Unique terms in the document
N is the total number of documents
nf is the number of documents that contain the term
```
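A minimal sketch of that formula in Python, using the definitions above. The function names and the example numbers are made up, and the natural log is an assumption since the base isn't stated. It also assumes 0 < nf < N, otherwise the last factor is undefined.

```python
import math
from collections import Counter

def term_weight(dtf, sumdtf, U, N, nf):
    """w = (log(dtf)+1)/sumdtf * U/(1+0.0115*U) * log((N-nf)/nf)"""
    tf_part = (math.log(dtf) + 1) / sumdtf   # damped term frequency, normalized per document
    length_part = U / (1 + 0.0115 * U)       # factor based on the number of unique terms
    idf_part = math.log((N - nf) / nf)       # rarer terms get a larger value
    return tf_part * length_part * idf_part

def document_weights(terms, doc_freq, N):
    """Weight every term of one document; doc_freq maps term -> nf (hypothetical input)."""
    counts = Counter(terms)                  # dtf per term
    sumdtf = sum(math.log(c) + 1 for c in counts.values())
    U = len(counts)
    return {t: term_weight(c, sumdtf, U, N, doc_freq[t]) for t, c in counts.items()}

# Hypothetical example: a 3-word document in a collection of 1000 documents.
weights = document_weights(["a", "a", "b"], {"a": 10, "b": 100}, N=1000)
print(weights)
```

Note that a term appearing in more than half of all documents gets a negative weight, because `(N - nf) / nf` drops below 1.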
>math
🤢
```
U is the number of Unique terms in the document
N is the total number of documents
nf is the number of documents that contain the term
```
these won't work if I split the db
and you will get shit results
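A small sketch of the problem with made-up numbers: the `log((N-nf)/nf)` factor depends on collection-wide counts, so the same term gets a different value in each chunk and scores from different chunks stop being comparable. The chunk sizes, the counts, and the natural log are assumptions.

```python
import math

def idf(N, nf):
    """The log((N-nf)/nf) factor; assumes 0 < nf < N."""
    return math.log((N - nf) / nf)

def compare_idf(shards):
    """shards: list of (N, nf) per chunk. Returns (whole-db idf, list of per-chunk idfs)."""
    total_N = sum(N for N, _ in shards)
    total_nf = sum(nf for _, nf in shards)
    return idf(total_N, total_nf), [idf(N, nf) for N, nf in shards]

# Hypothetical: 1000 documents split in two chunks; the term is common in one, rare in the other.
whole, per_chunk = compare_idf([(500, 90), (500, 10)])
print(whole, per_chunk)
```

With these numbers the term looks much more "relevant" in the second chunk than in the first, even though it is the same term in the same database.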
my way right now is OK, the problem is the data quality
for this, I ordered myself a nice GTX 1080 TI to put into the server that I'm waiting for
about 30% of my database is tagged and categorized correctly, the rest is trash
I am working on a ML model to categorize the rest
1080ti? damn
I'm dumb so how come a GPU will help with things? <:GWchinaSakuraThinking:398950680217255977>
ml accel
Or will it boost the machine learning shit
look at their examples
like the titanic survival rate
Never got too deep into it other than a forced "something" tutorial on MatLab
they are self-explanatory
Though, ideally, you should be grabbing something like Volta if you move to a different ML framework, like TensorFlow
ohshit
I didn't know C# had ML