(Feature Request): Count extension user dislikes #109

Open
1 of 2 tasks
niwla23 opened this issue Nov 30, 2021 · 108 comments
Labels
enhancement New feature or request

Comments

@niwla23

niwla23 commented Nov 30, 2021

Extension or Userscript?

Extension

Request or suggest a new feature!

From my understanding, right now this extension only fetches data from the YouTube API and archives / displays it.
I would suggest adding an extension (dis)like count. So when people connect their Google account and press the dislike button, it will send a request to the backend and show that below the normal like / dislike ratio.

Ways to implement this!

This could be implemented by using Google OAuth to make sure users don't spam the dislike endpoint.
Then you just add an event listener on the like / dislike buttons and send an authenticated request to the backend. Below the yt like count you just add another like / dislike ratio (maybe just a bar that shows the values on hover, but that is a design thing.)

Can you work on this?

  • Yes
  • No
@niwla23 niwla23 added the enhancement New feature or request label Nov 30, 2021
@niwla23 niwla23 changed the title (Feature Request): Count extesnion user dislikes (Feature Request): Count extension user dislikes Nov 30, 2021
@MrRobinhq

It could also estimate actual dislikes based on that, just take the ratio of yt api likes (which will hopefully remain public) and extension likes and solve for the unknown yt dislikes by assuming the same ratio between extension dislikes and yt api dislikes.
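
A minimal sketch of that extrapolation, assuming extension users like and dislike in the same proportion as all viewers: yt_dislikes ≈ ext_dislikes × yt_likes / ext_likes. The function name and the sample numbers are made up for illustration.

```python
from typing import Optional


def estimate_yt_dislikes(yt_likes: int, ext_likes: int, ext_dislikes: int) -> Optional[float]:
    """Scale extension dislikes by the YT-to-extension like ratio.

    Assumes extension users vote in the same proportion as all viewers,
    which is the assumption stated in the comment above.
    """
    if ext_likes <= 0:
        # Without any extension likes there is no ratio to extrapolate from.
        return None
    return ext_dislikes * yt_likes / ext_likes


# 10,000 public likes, 200 extension likes, 50 extension dislikes:
print(estimate_yt_dislikes(10_000, 200, 50))  # 2500.0
```

The estimate gets noisy when only a handful of extension users have voted on a video, so a real implementation would probably want a minimum sample size before showing it.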

@k98kurz

k98kurz commented Nov 30, 2021

It does not have to be an oauth integration. Any Sybil resistance method would work, e.g. scraping user data to ensure someone is logged in. Also, undisliking would have to be taken into account, which would require scraping the user's dislike state from the page.

@niwla23
Author

niwla23 commented Nov 30, 2021

It does not have to be an oauth integration. Any Sybil resistance method would work, e.g. scraping user data to ensure someone is logged in. Also, undisliking would have to be taken into account, which would require scraping the user's dislike state from the page.

What's a Sybil resistance method?

@niwla23
Author

niwla23 commented Nov 30, 2021

to ensure someone is logged in.

That basically sounds like client-side security...
EDIT: or do you mean taking the session cookie and using that? That would be a security hole I guess and most people would not be comfortable with an extension stealing their login cookies (is that even possible?)

@niwla23
Author

niwla23 commented Nov 30, 2021

It could also estimate actual dislikes based on that, just take the ratio of yt api likes (which will hopefully remain public) and extension likes and solve for the unknown yt dislikes by assuming the same ratio between extension dislikes and yt api dislikes.

I do not think they will leave the like/dislike ratio in there as you could actually just calculate it from the likes * ratio.

@k98kurz

k98kurz commented Nov 30, 2021

Whatever mechanism is chosen, it should optimize for accuracy and low API server load for good scalability. This extension will probably be used by tens of millions of people, so the server-side overhead should be minimized where possible. If accuracy/Sybil resistance can be accomplished without the oauth overhead, it would be beneficial. The more work that is done client-side, the more scalable the system will be. I'm not entirely familiar with the API used by Google/ThemTube, but the ideal system would be for the client to use some authenticated result from the API that is easy and quick for the server to verify.

@sy-b
Contributor

sy-b commented Nov 30, 2021

EDIT: or do you mean taking the session cookie and using that? That would be a security hole I guess and most people would not be comfortable with an extension stealing their login cookies (is that even possible?)

Yes, extensions can read & edit cookie data. And, I am against using cookies for identification.
For cookies, we can send its hash & expiration date for identification & delete the hash after that date. But it still wouldn't work as expected.

Even better way

  1. Ask permission from user & inform how to turn this off and how to delete their data.
  2. Make sure the user is signed in
  3. Hash the Channel ID (not the channel's custom link) with salts to anonymize it (on the client side)
  4. Use cryptography magic and encrypt it.
  5. Send it to database.
  6. This way we can give users a history of disliked video if they want it.
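
A rough sketch of step 3, assuming SHA-256 and a random 16-byte salt (the thread does not name a hash function or salt size, and the channel ID below is made up). Note that this only hides the ID; it does not prove that the sender owns the channel.

```python
import hashlib
import secrets


def anonymize_channel_id(channel_id: str, salt: bytes) -> str:
    """Return a hex digest of salt + channel ID.

    SHA-256 is an assumption; any modern cryptographic hash would do.
    """
    return hashlib.sha256(salt + channel_id.encode("utf-8")).hexdigest()


# Hypothetical channel ID, with a fresh random salt generated on the client.
salt = secrets.token_bytes(16)
print(anonymize_channel_id("UC_hypothetical_channel", salt))
```

The same channel ID with the same salt always gives the same digest, so the salt has to be kept somewhere if the user is ever to be recognized again.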

@niwla23
Author

niwla23 commented Nov 30, 2021

EDIT: or do you mean taking the session cookie and using that? That would be a security hole I guess and most people would not be comfortable with an extension stealing their login cookies (is that even possible?)

Yes, extensions can read & edit cookie data. And, I am against using cookies for identification. For cookies, we can send its hash & expiration date for identification & delete the hash after that date. But it still wouldn't work well

Even better way

1. Ask permission from user & **inform** how to turn this off and **how to delete their data**.

2. Make sure the user is signed in

3. Hash the Channel ID (not the channel's custom link) with salts to anonymize it (on the client side)

4. Use cryptography magic and encrypt it.

5. Send it to database.

6. This way we can give users a history of disliked video **if they want it**.

This will still not prevent users from spamming the endpoint. It can be easily exploited by scraping millions of channel IDs and sending them to the server. Channel IDs are public and unauthenticated: anyone can grab a random channel ID, hash it and start a dislike storm against a video.

So the only options I see for having unique, authenticated, youtube api independent dislikes:

  • Google OAuth
  • Custom auth system (hard because you will need some sort of verification, at least a captcha if not phone number. Google has these anyways)
  • Scraping the cookies and then updating the cookie by the last cookie after it has expired (feels wrong)

My personal preference would still be Google OAuth, because the extension maintainer will not need to maintain user data and Google accounts are hard to mass-create. Also, most people on YouTube will have a Google account anyway, so it kind of makes sense.
Only problem I could see is google banning this extension from using their OAuth bc they don't like it.

I am also not sure if a dislike history would be that useful, but as that seems not too hard to implement, why not

@sy-b
Contributor

sy-b commented Nov 30, 2021

(I was going to append my previous comment with this)

All of this👇 only makes sense if using OAuth is not possible (otherwise, ignore it)

By "cryptography magic", I meant whichever way we generate, store and send keys. That even includes whether a cookie is required by the extension to store this database.

Aim of "cryptography magic":

  • no malicious code from anywhere (e.g. other extension or script) on client side should be able to affect its counting.
  • preventing misuse of users hash/ whatever ID thing we are using
  • protecting server's data from vandalization.

If the ID is to be made more anonymous (might be a bit overkill)

  • That hash shouldn't be used as a permanent identifier for the user. That may pose a privacy risk (if a malicious actor intends to misuse it).
  • For permanent ID on Server, I suggest to use a random ID generated from 1st obtained hash & keep changing the salt in the hash from time to time (e.g. monthly/weekly) so that misusing generated hash is difficult.
    (permanent ID )---( temporary hash )---(user)
    Also make a 2nd hash & discard the 1st hash on registration. i.e. don't make client side ever use 1st hash.
  • I think the first salt should be sent with the first hash, so that in case we completely lose track of the user, we can use it for re-identification. ⚠But we will need to make a super hardened secure method to do it. (let's leave it for future)

@sy-b
Contributor

sy-b commented Nov 30, 2021

this will still not prevent users from spamming the endpoint. It can be easily exploited by scraping millions of channel IDs and sending them to the server. They are not unique nor authenticated, anyone can grab a random channel ID, hash it and start a dislike storm against a video.

OK, let's use the channel ID hash together with a cookie hash.
This will definitely require more effort, as the cookie will keep changing.

Anyways OAuth is better.

@sy-b
Contributor

sy-b commented Nov 30, 2021

Some pages are only accessible to channel owner.
Is it possible to use that somehow for verification?

An inconvenient idea is to ask user to make a private video/playlist & verify if that is accessible.
But still, the Yes/No response to server should be encrypted somehow to prevent that dislike storm.

Maybe, a combination of all of these can be used

  • Channel ID hash
  • Cookie hash
  • Pseudorandom salts for hashing. Some last characters will be decided by server.
  • Channel "settings page" accessibility

Can the owner of this repo confirm whether it's possible to use OAuth long-term?
I doubt, because this project is entirely funded by donations and I would like it to remain so.

@niwla23
Author

niwla23 commented Nov 30, 2021

How would you verify a cookie hash? It could be any made up string put into a hashing algorithm, channel IDs can be scraped and how are you going to verify that the settings page is actually accessible? The client could just lie about it, except for when you send an unhashed session cookie to the server for it to verify it.

I doubt, because this project is entirely funded by donations and I would like it to remain so.

I was assuming it is free, but that seems to be only true for up to 50K users which is already less than this extension has.

@sy-b
Contributor

sy-b commented Nov 30, 2021

My initial idea was to use cookie hashes for verification once the user is registered.

⚠ I am temporarily ignoring

  • malicious scripts
  • server-side problems such as database leaks
  • hash collisions
  • modification of YT's channel settings webpage by any means.

Here is my incomplete plan for registration-

  1. Extension will somehow (I am still thinking on this) verify that it's the real owner.
  2. Initiate the talk to server & send the channel ID.
  3. Server will generate & store a random string to be used as salt & send it to the extension.
  4. Extension will hash the Channel ID with that salt.
  5. Send that hash to server & forget the salt string.
  6. Server will verify if the hash is generated from its given salt. (now it may forget the channel ID, but must store that 8-10 digit salt string. If the user loses the temporary hash, this can be used to verify)
  7. Now this 1st hash will act as master ID.
  8. Extension will generate another salt B and send it to server.
  9. Salt B will be used by both to generate another hash for use.
  10. Server will keep a database of linkage of Master ID with new hashes.
  11. Server may send a cookie for identification. (if we can do this part well, then Google's cookie will only be needed to verify channel's ownership)
  12. Now, if we decide to use Google's cookies & user allows, then we can send the hashes of that cookie to server to be used as verification.
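
Steps 2 to 6 above could look roughly like this. It is only a sketch: the class and function names are invented, SHA-256 is an assumption, and the ownership check from step 1 is still missing, so on its own this proves nothing about who is sending the channel ID.

```python
import hashlib
import hmac
import secrets


def salted_hash(channel_id: str, salt: str) -> str:
    """Hash used by both sides (steps 4 and 6); SHA-256 is an assumption."""
    return hashlib.sha256(f"{salt}:{channel_id}".encode("utf-8")).hexdigest()


class RegistrationServer:
    """Hypothetical in-memory server side of steps 2, 3 and 6."""

    def __init__(self) -> None:
        self._pending = {}  # channel ID -> salt issued for it

    def start(self, channel_id: str) -> str:
        # Step 3: generate and remember a random salt, then send it out.
        salt = secrets.token_hex(8)
        self._pending[channel_id] = salt
        return salt

    def finish(self, channel_id: str, client_hash: str) -> bool:
        # Step 6: check that the hash was built from the salt we issued.
        salt = self._pending.pop(channel_id, None)
        if salt is None:
            return False
        expected = salted_hash(channel_id, salt)
        return hmac.compare_digest(expected, client_hash)


server = RegistrationServer()
salt = server.start("UC_hypothetical_channel")
print(server.finish("UC_hypothetical_channel", salted_hash("UC_hypothetical_channel", salt)))  # True
```

Each salt is single-use here: a second `finish` call for the same channel fails until `start` is called again.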

Now for Verification part (after registration), I have 2 options:

  • Verify using that 2nd hash as done in previous method.
  • Don't send hashes, instead some data which also depends on hashes & both cookies. (i am still thinking on this)

For verification for ownership before registration & before signing-in - check (I am not sure how):

  • if channel's setting page is available
  • Read channel ID from the setting's page

Notes:

  • In my opinion it'll be better if salts are generated via a Diffie-Hellman key exchange (maybe, except for the first time).

  • I am having a flow chart in my brain which is constantly evolving (& I need someone other than me to find flaws) which I first need to make somewhere. My schedule is also packed till Christmas, so I have near to no thinking time.

  • We can also get the user's info from the page, but I think the user must be anonymized as much as possible. So let's not use anything which can directly identify the user (except channel ID which is publicly visible anyways & will be discarded after registration). If possible, not even email for verification.

  • My main idea is that if a user can sign in into YT, then it should be possible to verify their channel ID. Just we need to find out how. So I want to use the "User's ability to sign in into YT" as the verification of ownership to prevent database vandalization.

  • Users must have ability to delete their data

  • I know this contains many flaws & incomplete bits, but I cannot do it alone & I am not an expert in authentication over the internet.

@niwla23
Author

niwla23 commented Nov 30, 2021

So the missing bit is some sort of information that can be used for verification and is only available to the user logged in, but is also not confidential.
We just need to verify this once, from then on you can basically just give it a long lived jwt or session key to authenticate with the backend.

I have one idea that may be relatively non-complicated to the end user:

Ask the user to change their account name temporarily to something with a verification code sent by the server:

  • Client asks server for verification token for their channelId
  • Server sends random 6 digit token
  • Client tells the user to change their username to [RYD-VERIFY:<6-digit-token>]
  • Client informs server that change has been done
  • Server rechecks, if check succeeds it sends a long lasting jwt that can be used from then on with all requests to change likes / dislikes. That JWT also contains a randomly generated user id which is the only thing the server has to store permanently when it comes to userdata. Likes and Dislikes can then just be stored as lists / M2M relationships of user ids.

Those endpoints should be heavily rate-limited to prevent brute-forcing.
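
A small sketch of the token part of this flow, under the assumption that the server can somehow read the channel's current public name (how it does that is not shown and not decided in this thread). The helper names are invented; JWT issuing and rate limiting are left out.

```python
import re
import secrets

# Marker format taken from the comment above.
MARKER = re.compile(r"\[RYD-VERIFY:(\d{6})\]")


def new_verification_code() -> str:
    """Random 6-digit code, zero-padded, from a cryptographic RNG."""
    return f"{secrets.randbelow(10**6):06d}"


def expected_marker(code: str) -> str:
    """Text the user is asked to put in their channel name."""
    return f"[RYD-VERIFY:{code}]"


def name_contains_code(display_name: str, code: str) -> bool:
    """Server-side recheck: does the public name carry the issued code?"""
    match = MARKER.search(display_name)
    if match is None:
        return False
    return secrets.compare_digest(match.group(1), code)


code = new_verification_code()
print(expected_marker(code))
print(name_contains_code(f"My Channel {expected_marker(code)}", code))  # True
```

With only a million possible codes, the rate limiting mentioned above matters a lot: without it an attacker could simply try codes until one matches.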

@niwla23
Author

niwla23 commented Nov 30, 2021

related: ajayyy/SponsorBlock#1039 (comment)

@LeoDog896

Maybe yall would be interested in https://gun.eco ? It's a decentralized database that would be perfect for this situation.

@Fry98

Fry98 commented Nov 30, 2021

I think the best idea is to go with Google OAuth for authentication and this project might also be a good candidate for AWS Lambda backend? Serverless should scale well without too much hassle setting up.

@sy-b
Contributor

sy-b commented Dec 1, 2021

I think the best idea is to go with Google OAuth for authentication and this project might also be a good candidate for AWS Lambda backend? Serverless should scale well without too much hassle setting up.

AWS Lambda and Google OAuth both are not free.

I doubt, because this project is entirely funded by donations and I would like it to remain so.

@sy-b
Contributor

sy-b commented Dec 1, 2021

Client tells the user to change their username to [RYD-VERIFY:<6-digit-token>]

That's a good option & if user permits,

  • this must happen in less than a second.
  • the user must not notice
  • if in any case anything fails for whatever reason (like crashes, power cuts), the username must be edited to normal again! We don't want users to be spooked by some random looking username which they didn't choose.

Problem:
from: https://support.google.com/youtube/answer/3046484

  • If your channel is verified, it will stay verified unless you change your channel name. If you change your channel name, the renamed channel won’t be verified, and you'll need to reapply.

That leaves us with 1 question:
What to do with Verified accounts?

(or should we use the ability of making private/unlisted playlists instead?)

@sy-b
Contributor

sy-b commented Dec 1, 2021

Maybe yall would be interested in https://gun.eco ? It's a decentralized database that would be perfect for this situation.

That's a good option
But I think the user also must prove the ownership of their channel, before registration & signing in every time.
That part is a bit messy.

@sy-b
Contributor

sy-b commented Dec 1, 2021

Instead of changing the username, let's use the Channel info for that purpose (after the user permits).
& rest as described by @niwla23

@sy-b
Contributor

sy-b commented Dec 1, 2021

So the process should roughly look like this?

(👇 for registration and subsequent sign-ins)

  1. Initiate the talk to server & send the channel ID.
  2. A long random string will be generated by server & sent to client.
  3. Client temporarily will add that to channel's About Tab.
  4. After a delay, server will scrape the channel's About Tab for that string.
  5. If it matches, then server will proceed to 7.
  6. After a fixed delay, if server doesn't respond, client will erase the string from About Tab.

(👇 only for registration/sign-ups)

  7. Server will generate & store a random string to be used as salt & send it to the extension.
  8. Extension will hash the Channel ID with that salt.
  9. Send that hash to server & forget the salt string.
  10. Server will verify if the hash is generated from its given salt. (now it may forget the channel ID, but must store that 8-10 digit salt string. If the user loses the temporary hash, this can be used to verify)
  11. Now this 1st hash will act as master ID.

After that, we have many ways to verify the user. I think instead of relying on any one, we should use a combination of these.

@k98kurz

k98kurz commented Dec 1, 2021

This would be fairly easy if one of the exposed Google auth APIs created a JWT signed with a private key (edit: ES[256|384|512], RS[256|384|512], or PS[256|384|512] alg), because then the authentic public key could be retrieved, cached, and used for quick verification of the user token. But assuming that is not possible, I would further optimize the registration process like this:

  1. Initiate the talk to the server; send the channel ID; retrieve a token created by the server comprised of encrypted JSON of pending-status, client IP, channel ID, and expiration date. This could be a JWE structure or a simple, custom system of base64 encoded [iv, ciphertext, hmac] joined by periods with the plaintext being the values from the previous sentence encoded as JSON.
  2. Client temporarily prepends that token to an editable field in the channel's About Tab.
  3. Client then sends the URI for the channel About Tab using the previously obtained token as a temporary auth token. This step may possibly include a CSS selector path for the specific field, in which case only the client code would need to be updated when the field location changes -- it could even have a simple system where the user clicks in a field to select it (event listeners on all fields that get removed once one has been fired).
  4. Server validates the supplied auth token, validates that the supplied URI maps to the channel ID, and checks the About Tab. If it worked, it returns a new encrypted token to the client specifying authenticated-status, channel ID hash, and expiration date; if the supplied token or URI failed validation, or if the URI content verification failed, it returns an error.
  5. Client removes the prepended data from the About Tab field upon the earlier of step 4 completing, step 4 having a network error, or some set timeout.

At this point, the client has a token that can be verified by the server as authenticating that user, and the database does not have to store any auth data throughout the entire process since it is all handled by encrypted tokens. The tokens would be unforgeable without compromising the key(s) stored in the server environment.
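
To illustrate the stateless-token idea, here is a simplified sketch. It is not the encrypted [iv, ciphertext, hmac] structure described above: the Python standard library has no AES, so this version only signs the payload with HMAC-SHA256, which means the client can read the claims but cannot forge or alter them. Field names, the key and the function names are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(key: bytes, status: str, channel_id: str, ttl_seconds: int,
                now: Optional[float] = None) -> str:
    """Build 'payload.tag': base64 JSON claims plus an HMAC over them."""
    now = time.time() if now is None else now
    claims = {"status": status, "channel_id": channel_id, "exp": now + ttl_seconds}
    payload = json.dumps(claims, sort_keys=True).encode("utf-8")
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return _b64(payload) + "." + _b64(tag)


def verify_token(key: bytes, token: str, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the tag is valid and the token has not expired."""
    now = time.time() if now is None else now
    try:
        payload_part, tag_part = token.split(".")
        payload, tag = _unb64(payload_part), _unb64(tag_part)
    except ValueError:
        return None  # wrong number of parts or bad base64
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return None
    claims = json.loads(payload)
    return claims if claims["exp"] >= now else None


key = b"server-side secret, never shipped to clients"
token = issue_token(key, "pending", "UC_hypothetical_channel", ttl_seconds=300)
print(verify_token(key, token))
```

Because the server only needs the key to check a token, nothing has to be written to the database during the handshake, which is the property the comment above is after.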

As for the hashing scheme, I do not understand its purpose, so I cannot comment as to whether or not it would work. It seems like an unnecessary complication. If the goal is to optimize privacy for further API actions, then the channel ID can be included in the final token in a hash form rather than in plaintext, and even then the token will only be decryptable by the API server.

@sy-b
Contributor

sy-b commented Dec 1, 2021

Yes, using channel hash was for privacy.
In case the server database ever gets compromised, it should reveal as little info as possible.
(imagine after a Db leak people making videos like "<channel name>'s most disliked channels", Why "<famous channel name> hates these")
I suggest - if possible, even if the (unencrypted) Db is made public, it must be useless for malicious actors.

Your method is really good,
but I think the initial computations could be made lightweight

@sy-b
Contributor

sy-b commented Dec 1, 2021

I had these aims in my mind for the process:

(& I have added some)

  1. Computation, network and storage use must be minimal on the server, for points 2 & 3.
  2. Must be easily Scalable & cheap (after all this project is funded by people).
  3. In case of DDoS attack or any attack for vandalization, the server must be able to handle the load, and then filter attacking IP addresses if possible. If the 1st request sent to server demands heavy work, & if the server does it, then that'll make it easy for attackers to create problems. That's why I was trying to make the initial steps as lightweight as possible for the server. I feel that only authenticated users deserve error messages from the server (prevent that as well if possible - for point 1).
  4. Stay away from Google's cookies (wherever & whenever possible; only use these as last resort). Also because cookies keep changing. Google will definitely not like us using their cookies.
  5. Anonymize the user as much as possible. (hence the master ID which is never exposed but is linked to another temporary ID e.g. JWT, hash, some weird combo, etc, whatever performs well )
  6. The database, if made publicly available, must be garbage for nuisance creators.
  7. Minimum or no user intervention because "Best Designs are Invisible" & point 8.
  8. Must not freak out the user in any way. Some users are paranoid & most only know how to enjoy the service. If we are appending to the About Tab, that may freak them out, hence this must be done silently & all traces must be removed, whether the process was successful or not.
  9. Prevent double registration of the same user.
  10. The master ID and the stored salt, should be enough to verify the users, but must be used only in that case when every other identification method is unavailable/lost (like signing-in on new device).

Feel free to add more or remove some.

@niwla23
Author

niwla23 commented Dec 1, 2021

@sy-b :

(👇 for registration and subsequent sign-ins)

I don't see why we would want to do that on every sign in. Why not just create a very long lasting JWT (which contains a userid) and send that to the server every time?
Not a security expert though, maybe this is an issue.

@niwla23
Author

niwla23 commented Dec 1, 2021

image

@k98kurz

k98kurz commented Jun 21, 2022

Iirc, when Dr. Adam Back invented hashcash, he made the difficulty parameter tunable to adapt to increasing computer power and network utilization. Perhaps instead of authenticating users, we just need some metric for adjusting the proof-of-work difficulty to throttle clients as an adaptive way to combat spam, lowering the difficulty after the spam attack/high throughput period ends.
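
A hashcash-style sketch of this, assuming SHA-256 and a "leading zero bits" difficulty. The difficulty-adjustment rule at the end is purely hypothetical (thresholds and bounds are made up); the thread does not settle on a metric.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Number of zero bits at the start of the digest."""
    return len(digest) * 8 - int.from_bytes(digest, "big").bit_length()


def check(challenge: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Cheap server-side verification: a single hash."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty_bits


def solve(challenge: bytes, difficulty_bits: int) -> int:
    """Client-side work: about 2**difficulty_bits hashes on average."""
    for nonce in count():
        if check(challenge, nonce, difficulty_bits):
            return nonce


def next_difficulty(current: int, requests_per_minute: float, target: float,
                    lowest: int = 8, highest: int = 28) -> int:
    """Hypothetical policy: harder under load, easier once it passes."""
    if requests_per_minute > target:
        current += 1
    elif requests_per_minute < target / 2:
        current -= 1
    return max(lowest, min(highest, current))


nonce = solve(b"hypothetical-challenge", difficulty_bits=12)
print(nonce, check(b"hypothetical-challenge", nonce, 12))
```

Each extra bit doubles the expected client work while verification stays at one hash, which is what makes the parameter useful as a throttle.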

@cyrildtm
Contributor

adjusting the proof-of-work difficulty to throttle clients as an adaptive way to combat spam

it's technically correct, but I doubt if it's effective against any dedicated attackers.

@sy-b
Contributor

sy-b commented Jun 21, 2022

  1. Mobile devices will suffer.

  2. GPUs & ASICs will still be at an advantage.
    Dedicated attackers will probably use GPUs or even ASICs to create IDs

@cyrildtm
Contributor

what about using captcha?

  • make user registration mandatory for voting and only when extension voting is enabled (default off)
  • add screenshots and description in extension front page and say there's a menu when you click the extension button
  • upon register, get the random ID the same way (I don't like crypto.random or whatever that is though, why not just get random numbers)
  • give user a register link containing that random ID. save that ID in storage so it won't get lost.
  • user copy-paste that link so they know nothing else is uploaded. The link goes to a captcha challenge. When the challenge is solved, this ID is marked as registered. Display a bold and loud message saying you must go back to the extension menu and mark it as registered.
  • alternatively, when voting for the first time, check with server if that ID is registered, and mark it in storage if yes. If no, show a message saying you need to register. but I don't know what kind of message is better (in-page hover etc.)
  • alternatively, return a dedicated server error code if the user ID is invalid/unregistered
  • in a nutshell, replace crypto.digest with a captcha link

image

does that mean a call is counted even if it fails? That would be stupid

  • current users are not affected if they have the registered variable in storage
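
The registration flow in the bullets above might be sketched like this. The URL, class and function names are placeholders, the captcha itself is reduced to a boolean, and the ID here comes from Python's `secrets` module as one possible source of randomness.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

# Placeholder address; the real registration endpoint is not specified here.
REGISTER_URL = "https://example.invalid/register"


def new_user_id() -> str:
    """Random ID generated once and kept in extension storage."""
    return secrets.token_urlsafe(24)


def registration_link(user_id: str) -> str:
    """Link the user copies into the browser to reach the captcha page."""
    return f"{REGISTER_URL}?{urlencode({'id': user_id})}"


def user_id_from_link(link: str) -> str:
    """Server side: recover the ID from the opened link."""
    return parse_qs(urlparse(link).query)["id"][0]


class Registry:
    """Hypothetical server-side record of IDs that solved the captcha."""

    def __init__(self) -> None:
        self._registered = set()

    def complete_captcha(self, user_id: str, captcha_passed: bool) -> bool:
        if captcha_passed:
            self._registered.add(user_id)
        return captcha_passed

    def accept_vote(self, user_id: str) -> bool:
        # Votes from unregistered IDs get rejected (or a dedicated error code).
        return user_id in self._registered


registry = Registry()
uid = new_user_id()
print(registration_link(uid))
print(registry.accept_vote(uid))  # False until the captcha is solved
```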

@sy-b
Contributor

sy-b commented Jun 21, 2022

what about using captcha?

Yeah, that was one of the suggestions


make user registration mandatory for voting

Might reduce vote submissions. Users prefer "hassle free" set up.
I think, they should be asked to solve the captcha when they click on 👎.

only when extension voting is enabled (default off)

👍, but I would like the default to be "On".


Captcha surely raises the bar, but well-committed adversaries can simply use captcha-solving services.
E.g. 2captcha.com charges $1 for 1k captchas and also provides solutions for reCAPTCHA v2 & v3

@sy-b
Contributor

sy-b commented Jun 21, 2022

Or we can self-host this https://github.com/ericwang401/captcha

(not serious; repo link edited; actual link https://github.com/ericwang401/chess-captcha)

@JacksonChen666
Contributor

JacksonChen666 commented Jun 21, 2022 via email

@cyrildtm
Contributor

@sy-b your link is messed up

@sy-b
Contributor

sy-b commented Jun 21, 2022

yeah 😅

@k98kurz

k98kurz commented Jun 21, 2022

Could also just make a custom proof-of-work system with tunable parameters. JavaScript is a pain in the ass compared to python, but could do something like this without too much development effort. It would be ASIC resistant in the sense that it can be made memory-intensive, and in the sense that if someone developed an ASIC, it could be invalidated by incrementing the tuning parameter.
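
One way such a memory-intensive, tunable scheme could look, swapping in scrypt from Python's `hashlib` as the memory-hard function (this is an illustrative choice, not the construction the comment above refers to). `memory_cost` is scrypt's `n` parameter and must be a power of two; the other parameters are fixed assumptions.

```python
import hashlib
from itertools import count


def pow_digest(challenge: bytes, nonce: int, memory_cost: int) -> bytes:
    """scrypt needs roughly 128 * n * r bytes, so n drives memory use."""
    return hashlib.scrypt(challenge, salt=nonce.to_bytes(8, "big"),
                          n=memory_cost, r=8, p=1, dklen=32)


def leading_zero_bits(digest: bytes) -> int:
    return len(digest) * 8 - int.from_bytes(digest, "big").bit_length()


def verify(challenge: bytes, nonce: int, difficulty_bits: int, memory_cost: int) -> bool:
    """Server check: one scrypt call with the currently required parameters."""
    return leading_zero_bits(pow_digest(challenge, nonce, memory_cost)) >= difficulty_bits


def solve(challenge: bytes, difficulty_bits: int, memory_cost: int) -> int:
    """Client work: try nonces until the digest has enough leading zero bits."""
    for nonce in count():
        if verify(challenge, nonce, difficulty_bits, memory_cost):
            return nonce


# Small parameters so the demo finishes quickly (about 1 MiB per hash).
nonce = solve(b"hypothetical-challenge", difficulty_bits=4, memory_cost=2**10)
print(nonce, verify(b"hypothetical-challenge", nonce, 4, 2**10))
```

Changing `memory_cost` changes every digest, so solutions (or hardware) prepared for the old value stop being useful. The trade-off is that the server also pays one memory-hard hash per verification.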

@sy-b
Contributor

sy-b commented Jun 21, 2022

👍

Assuming that we are going to implement just enough defenses, this can be fine.
But generally speaking, any proof-of-X or any cryptography related algorithm, must be audited thoroughly by experts before wide use. For this particular case your solution might suffice, but I am not sure.

Maybe we can stack multiple solutions on top of each other.

@sy-b
Contributor

sy-b commented Jun 21, 2022

from: https://softwareengineering.stackexchange.com/a/115120

You can't Prevent it, you have to provide incentives against it.

You problem is that your system has a built in incentive to create multiple accounts. Remove that incentive or give a better one for sticking with one account.

@cyrildtm
Contributor

cyrildtm commented Jun 21, 2022

Two ways of registration --

  • The easy way: Use OAuth (may be the same way for creators)
  • The privacy-friendly way: Use random number and the chess puzzle

Now I've gotta learn chess. How? More youtube videos?

@sy-b
Contributor

sy-b commented Jun 21, 2022

What if you could verifiably get the API response without getting user credentials?

@sy-b
Contributor

sy-b commented Jun 21, 2022

Imagine if we make an app for creators which periodically auto-submits dislike data while running in the background. Can we get top 67% of creators to use it?

If creators themselves submit their data, then there's hardly any point of using users' data. This can reduce the "dislike bombing incentive" (on RYD's DB).

@cyrildtm
Contributor

cyrildtm commented Jun 21, 2022

What if you could verifiably get the API response without getting user credentials?

Not sure what you mean?

If creators themselves submit their data, then there's hardly any point of using users' data. This can reduce the "dislike bombing incentive" (on RYD's DB).

That would be great. Add a flag/attribute: IsCreatorDataPresent

if ! IsCreatorDataPresent {
  record_vote();
}

Can we get top 67% of creators to use it?

You can't create a culture.* You can't even get 20% of the people to vote on the same issue. You can't even get 20% of the people to care enough to vote.

* unless you plan to become a cult leader.

@sy-b
Contributor

sy-b commented Jun 21, 2022

Not sure what you mean?

Remember that TLS modification method? We should be able to use that for dislike button.

The basic idea:

  • RYD will search for resp.responseContext.consistencyTokenJar in youtube.com/youtubei/v1/like/dislike's response. This is missing if the video was disliked. (here resp is the endpoint's response)
    i.e. If I send dislike request twice, consistencyTokenJar will be missing from the 2nd request's response.
    This way RYD can know if a logged-in user is disliking multiple times.

  • For non logged-in users, it should respond with error 400. (Not 100% sure, but happened with me.)
    or resp.responseContext.mainAppWebResponseContext.loggedOut should be true
    or some other indication will hopefully be there.
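
As a sketch, the checks described above could be written like this. The field names come straight from this comment and are not verified against any documentation; the behaviour of the endpoint may differ or change, and the function name and return labels are invented.

```python
def classify_dislike_response(status_code: int, resp: dict) -> str:
    """Classify a dislike-endpoint response using the heuristics above.

    Returns "logged_out", "new_dislike" or "repeat_dislike".
    """
    context = resp.get("responseContext", {})
    logged_out = context.get("mainAppWebResponseContext", {}).get("loggedOut") is True
    if status_code == 400 or logged_out:
        return "logged_out"
    if "consistencyTokenJar" in context:
        # Reported to be present only when the dislike was newly applied.
        return "new_dislike"
    return "repeat_dislike"


first = {"responseContext": {"consistencyTokenJar": {}, "mainAppWebResponseContext": {"loggedOut": False}}}
second = {"responseContext": {"mainAppWebResponseContext": {"loggedOut": False}}}
print(classify_dislike_response(200, first))   # new_dislike
print(classify_dislike_response(200, second))  # repeat_dislike
```

The hard part is still that the client reports this response itself, which is why the thread ties this idea to the TLS modification method.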

Benefits

  • This way RYD can offload the burden of spam protection onto YT.
    So, even if I create 100 RYD IDs, I will be limited by the number of Google accounts I can use. Which is 1 or 2 for most legitimate users.

  • No signups required for RYD.
    Note: It will still be possible to bloat RYD DB by just creating user IDs.


Drawbacks

  • If someone uses click farm services (or something similar) to spam the YT DB, and YT detects it, YT can discard that dislike before merging it into their main "verified" DB (or whatever that is).
    In this case, RYD will count that dislike, but YT won't.

  • While most of the info returned by the API endpoint is useless, and some data objects trackingParams and encryptedTokenJarContents change with every response, 2 objects datasyncId and country-type can be a bit concerning for some privacy paranoid users.

    • datasyncId seems mostly useless to me. I may be able to tell if the user is using a secondary channel.

      from: https://github.com/yt-dlp/yt-dlp/blob/master/yt_dlp/extractor/youtube.py#L524&#L525

      # datasyncid is of the form "channel_syncid||user_syncid" for secondary channel
      # and just "user_syncid||" for primary channel. We only want the channel_syncid

      The worst thing I can imagine to do with this is to keep a track of and make a list of related channels if possible.
      Note: videoID is not present in the response. Hence, it isn't possible to track individual viewing habits from this.

    • country-type - I am personally not concerned about this. idk about others
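
The datasyncId layout quoted from the yt-dlp comment above can be sketched as follows. This is only an illustration in Python: the type and function names are hypothetical, and the format ("channel_syncid||user_syncid" for a secondary channel, "user_syncid||" for the primary one) is assumed to be exactly as that comment describes it:

```python
from typing import NamedTuple, Optional


class DatasyncId(NamedTuple):
    channel_syncid: Optional[str]  # None when the primary channel is in use
    user_syncid: str


def parse_datasync_id(datasync_id: str) -> DatasyncId:
    """Split a datasyncId into its channel and user parts."""
    first, _, second = datasync_id.partition("||")
    if second:
        # Secondary channel: "channel_syncid||user_syncid"
        return DatasyncId(channel_syncid=first, user_syncid=second)
    # Primary channel: "user_syncid||"
    return DatasyncId(channel_syncid=None, user_syncid=first)
```

A non-None channel_syncid is what would reveal that a secondary channel is in use; two responses with the same user_syncid but different channel_syncid values are what would allow related channels to be linked.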




unless you plan to become a cult leader.

🤣

For some reason this exists:
https://markmanson.net/how-to-start-a-cult-and-save-the-world

"Save the world"?!? 🤨

@sy-b
Contributor

sy-b commented Jun 21, 2022

For those who don't know / can't remember/find the TLS modification method:

2 variants were suggested

The discussion starts from here
https://discord.com/channels/909435648170160229/912841275974234122/926199717606600764


@SyntaxBlitz's variant

explanation starts from here https://discord.com/channels/909435648170160229/912841275974234122/926315380270571531

His first post:

SyntaxBlitz TLS modification 1st post Capture
(Time according to UTC+0)

TL;DR -
similar to what I described but the server & client have kind of switched roles


My Variant

(This is the summary as posted on 30 Jan 2022)


Aim: Ask the client to get the data and submit it to RYD but make sure that the client doesn't modify it.

Note:

  • This is an extremely simplified version.
  • I hope you know how TLSv1.3 and proxy re-encryption works.
  • The client & RYD server establish a secure connection using TLS, i.e. they share a common symmetric key,
    and the communication between them is encrypted normally.

For TLSv1.3

RYD --> Client : Sends 'client hello' to the client for the YT server

Client --> YT : Sends the 'client hello' received from RYD to YT

YT --> Client : Responds with 'server hello'

Client --> RYD : Generates a key for symmetric encryption (optional). Sends the response and the key (optional) to RYD.

RYD Server : Verifies (using certificates) that the response is from YT. Calculates the symmetric key for YT (I'll refer to it as YT key). And uses this along with the client's key to generate a unidirectional proxy re-encryption key. I'll refer to this as PRE key. This PRE key can only re-encrypt ciphertext encrypted with the client's key.

RYD --> Client : Sends the PRE key to the client.

Client : Generates the request using the creator's API key. Encrypts it with the client's key. Re-encrypts it with the PRE key.

Client --> YT : Sends the encrypted request to YT.

YT --> Client : Responds to the requests.

Client --> RYD : Sends the response to RYD.

RYD Server : Decrypts the response with the YT key.

Done!

If you missed this:

  • The client cannot decrypt YT's response because it does not have the YT key.
  • In step 5, RYD verifies that the client is sending YT's response by using YT's certificate. RYD must already have YT's certificate.

Important Note

These posts have not been updated.

@sy-b
Contributor

sy-b commented Jun 21, 2022

if we make an app for creators which periodically auto-submits dislike data while running in the background

using this

I do have some ideas, but I am unable to find enough time. Maybe I can sometime between September and October.

@k98kurz

k98kurz commented Jun 21, 2022

But generally speaking, any proof-of-X or any cryptography related algorithm, must be audited thoroughly by experts before wide use.

The idea is to use established hash algorithms to derive pseudorandom data which applies operations on other pseudorandom data, then hash the result. As long as the hash algorithms used to generate the pseudorandom data are secure, the rest follows logically without requiring cryptography experts. The wrinkle with my proposal is that SubtleCrypto provides just 4 hash algorithms from 1 family, so it can't be as ASIC-resistant as the Python version with 8 algorithms from 4 families.

It is also worth noting that YT is not very good at preventing spam accounts. Idk if anyone else has been on YT recently, but every comment on every video gets a response from a spam bot account. Whatever YT is doing is clearly not enough, so I suggest adding hashcash to the system on top of whatever YT-specific system is used. Comment spam and dislike spam are not mechanically identical, but the logic is.
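
For anyone unfamiliar with hashcash, here is a minimal sketch of the classic scheme in Python. It is plain SHA-256 hashcash, not the multi-algorithm proposal described above, and the challenge string format, function names and difficulty value are all assumptions made for illustration:

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Cheap check done by the server: one hash per submitted vote."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode("utf-8")).digest()
    return leading_zero_bits(digest) >= difficulty_bits


def mint(challenge: str, difficulty_bits: int) -> int:
    """Expensive search done by the client: try nonces until one verifies."""
    for nonce in count():
        if verify(challenge, nonce, difficulty_bits):
            return nonce
```

A client would call mint with a challenge that binds the work to one vote (for example a video ID plus a server-issued token, both hypothetical here) and send the nonce along with the dislike. Each extra difficulty bit doubles the expected client work while the server still needs a single hash to check it.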

@SyntaxBlitz

SyntaxBlitz commented Jun 21, 2022

I did end up posting a more in-depth explanation of the TLS shenanigans method (and something resembling a proof-of-concept) here: https://github.com/SyntaxBlitz/yt-dislike-fetcher

@sy-b
Contributor

sy-b commented Jun 22, 2022

Another idea -

What if we use the hash of userID in datasyncId as RYD ID ?
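
A rough sketch of what that could look like, in Python for illustration. The function name is hypothetical, SHA-256 without a salt is an arbitrary choice made here, and the datasyncId format is assumed to be the one quoted from yt-dlp earlier in this thread:

```python
import hashlib


def ryd_id_from_datasync_id(datasync_id: str) -> str:
    """Derive a stable RYD ID from the user part of a datasyncId."""
    # Assumed format: "channel_syncid||user_syncid" for a secondary channel,
    # "user_syncid||" for the primary channel.
    first, _, second = datasync_id.partition("||")
    user_syncid = second or first
    # Assumption: unsalted SHA-256; nothing in this thread specifies the hash.
    return hashlib.sha256(user_syncid.encode("utf-8")).hexdigest()
```

With this, the primary and secondary channels of one Google account map to the same RYD ID, so switching channels would not create a second voter.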

@sy-b
Contributor

sy-b commented Jun 22, 2022

Privacy-paranoid folks might say: "You collect the videoID, then YT's userID, and you also have a hash of our IP address. Which means you know what we are watching!"

@cyrildtm
Contributor

You can't have it both ways. Dislike is user data. Voting is providing user data. The flow of information doesn't disappear no matter how you change the method.

@sy-b
Contributor

sy-b commented Jun 22, 2022

Possible solution:
https://www.npmjs.com/package/tor-request

We can send the response from youtube.com/youtubei/v1/like/dislike to RYD over Tor.

@cyrildtm
Contributor

Tor is okay for polling, but voting is vulnerable this way. How can you tell a user from a sweatshop monkey?

@sy-b
Contributor

sy-b commented Jun 22, 2022

but voting is vulnerable this way

Can you explain?

@cyrildtm
Contributor

#344 (comment)

@cyrildtm
Contributor

Honestly, when this extension moves to anything other than straightforward HTTPS requests from my computer to the API server, I will quit using it. It's not worth risking my digital security against strangers on the open web just for some YouTube videos. I'm only watching certain subscribed channels and occasional very focused searches, and I haven't subscribed to any new channels for a while. I don't care too much about likes/dislikes anyway.
I hope this represents a certain portion of the user base.

@sy-b
Contributor

sy-b commented Jun 22, 2022

No description provided.
