[ie/PornHub] add attributes, issue #9524 #9527

Open
wants to merge 2 commits into
base: master
from

sky-cake

IMPORTANT: PRs without the template will be CLOSED

Description of your pull request and other information

This PR adds model attributes, spoken language, and production attributes to the PornHub info extractor. It also fixes some of the broken PornHubIE tests.

Fixes #9524

Template

Before submitting a pull request make sure you have:

In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under Unlicense. Check all of the following options that apply:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

PornHubIE Test Logs

python3 devscripts/run_tests.py PornHubIE_all
Running ['pytest', '-Werror', '--tb=short', 'test/test_download.py::TestDownload::test_PornHub_all']
Running ['/usr/bin/python3', '-Werror', '-m', 'unittest', 'test.test_download.TestDownload.test_PornHub_all']
[debug] Loaded 1807 extractors
[PornHub] Extracting URL: http://www.pornhub.com/view_video.php?viewkey=648719015
[PornHub] 648719015: Downloading pc webpage
[PornHub] 648719015: Downloading m3u8 information
[PornHub] 648719015: Downloading m3u8 information
[PornHub] 648719015: Downloading m3u8 information
[PornHub] 648719015: Downloading m3u8 information
[PornHub] 648719015: Downloading JSON metadata
[debug] Formats sorted by: hasvid, ie_pref, lang, quality, res, fps, hdr:12(7), vcodec:vp9.2(10), channels, acodec, size, br, asr, proto, vext, aext, hasaud, source, id
[info] 648719015: Downloading 1 format(s): hls-1500-1
[info] Writing video metadata as JSON to: test_PornHub_648719015.info.json
[debug] Invoking hlsnative downloader on "https://ev-h.phncdn.com/hls/videos/201306/28/14084201/,200109_2050_720P_4000K,200109_2050_480P_2000K,200109_2050_240P_1000K,_14084201.mp4.urlset/index-f1-v1-a1.m3u8?validfrom=1711319047&validto=1711326247&ipa=72.140.162.251&hdl=-1&hash=misAqEoDasO%2F09xqxlX2SCszsak%3D"
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 90
[download] Destination: test_PornHub_648719015.mp4
[download] 100% of    1.02MiB in 00:00:00 at 3.58MiB/s
Skipping PornHub: Video has been flagged for verification in accordance with our trust and safety policy
Skipped test_PornHub_1
Skipping PornHub: This video has been disabled
Skipped test_PornHub_2
[debug] Loaded 1807 extractors
[PornHub] Extracting URL: http://www.pornhub.com/view_video.php?viewkey=ph601dc30bae19a
[PornHub] ph601dc30bae19a: Downloading pc webpage
[PornHub] ph601dc30bae19a: Downloading m3u8 information
[PornHub] ph601dc30bae19a: Downloading m3u8 information
[PornHub] ph601dc30bae19a: Downloading m3u8 information
[PornHub] ph601dc30bae19a: Downloading m3u8 information
[PornHub] ph601dc30bae19a: Downloading m3u8 information
[PornHub] ph601dc30bae19a: Downloading JSON metadata
[debug] Formats sorted by: hasvid, ie_pref, lang, quality, res, fps, hdr:12(7), vcodec:vp9.2(10), channels, acodec, size, br, asr, proto, vext, aext, hasaud, source, id
[info] ph601dc30bae19a: Downloading 1 format(s): hls-2835
[info] Writing video metadata as JSON to: test_PornHub_3_ph601dc30bae19a.info.json
[debug] Invoking hlsnative downloader on "https://cv-h.phncdn.com/hls/videos/202102/05/383080302/1080P_8000K_383080302.mp4/index-v1-a1.m3u8?YLun6I8kmbsWj-rMHovpmI-V2Bpldj8Pf139H2Ilg0BFDkSFQXq2i9PcV2hOGwQrXAoBmlOwn5WDbx3_WZHGKpe3ud4FQiXjyAnIjtP1WpRzUa-ttrBhoqdXAhhFBLqKeuWEf9tFbWlLD3D3VUjM0GRNVDd-tCTBvxpmafUxrfFp4aF6tw4CUQUlJ7e5EXQAI9jGuLFcYVk"
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 2043
[download] Destination: test_PornHub_3_ph601dc30bae19a.mp4
[download] 100% of  191.30KiB in 00:00:00 at 2.50MiB/s
[debug] Loaded 1807 extractors
[PornHub] Extracting URL: https://www.pornhub.com/view_video.php?viewkey=65a6ca42725f2
[PornHub] 65a6ca42725f2: Downloading pc webpage
[PornHub] 65a6ca42725f2: Downloading m3u8 information
[PornHub] 65a6ca42725f2: Downloading m3u8 information
[PornHub] 65a6ca42725f2: Downloading m3u8 information
[PornHub] 65a6ca42725f2: Downloading m3u8 information
[PornHub] 65a6ca42725f2: Downloading m3u8 information
[PornHub] 65a6ca42725f2: Downloading JSON metadata
[debug] Formats sorted by: hasvid, ie_pref, lang, quality, res, fps, hdr:12(7), vcodec:vp9.2(10), channels, acodec, size, br, asr, proto, vext, aext, hasaud, source, id
[info] 65a6ca42725f2: Downloading 1 format(s): hls-3969-1
[info] Writing video metadata as JSON to: test_PornHub_4_65a6ca42725f2.info.json
[debug] Invoking hlsnative downloader on "https://ev-h.phncdn.com/hls/videos/202401/16/446618441/,1080P_4000K,720P_4000K,480P_2000K,240P_1000K,_446618441.mp4.urlset/index-f1-v1-a1.m3u8?validfrom=1711319069&validto=1711326269&ipa=72.140.162.251&hdl=-1&hash=cQh1Jmxnsw5hTsBfmOtGphbB1Yg%3D"
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 89
[download] Destination: test_PornHub_4_65a6ca42725f2.mp4
[download] 100% of  408.68KiB in 00:00:00 at 2.29MiB/s
.
----------------------------------------------------------------------
Ran 1 test in 24.779s

OK

@seproDev seproDev added the site-enhancement Feature request for some website label Mar 25, 2024
Comment on lines +562 to +564
'production': extract_list('production'),
'language_spoken': extract_list('langSpoken'),
'model_attributes': get_model_attributes(),
Collaborator

Author

Does this mean they cannot be added to the extractor?

Collaborator

I don't think there are any nice ways to map this data to valid info dict fields. You could propose adding new fields, but due to being very site-specific, I don't think they would be accepted.

If you need this data, you should look into creating a plugin: https://github.com/yt-dlp/yt-dlp?tab=readme-ov-file#developing-plugins
You can even re-use your code changes submitted here.

Member

@pukkandan pukkandan Mar 25, 2024

The language could be put inside the audio formats. Would it be reasonable to merge attributes/production with tags/categories respectively? Or are they too different?
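
For illustration, the mapping suggested above could look roughly like the sketch below. It is not code from this PR: `merge_site_fields` is a hypothetical helper, and it assumes the three extracted values are plain lists of strings (the return type of `get_model_attributes()` is not shown in this thread). It folds model attributes into `tags`, production values into `categories`, and sets `language` on every format that is not marked as video-only. The site may report language names rather than language codes, so a conversion step might still be needed.

```python
def merge_site_fields(info, production=(), languages=(), model_attributes=()):
    """Return a copy of an info dict with site-specific lists folded into
    the standard 'tags', 'categories' and per-format 'language' fields.

    Hypothetical helper: all three inputs are assumed to be lists of strings.
    """
    merged = dict(info)

    def extend_unique(existing, extra):
        # Keep the original order and skip empty values and duplicates.
        result = list(existing or [])
        for item in extra:
            if item and item not in result:
                result.append(item)
        return result

    merged['tags'] = extend_unique(info.get('tags'), model_attributes)
    merged['categories'] = extend_unique(info.get('categories'), production)

    # Only the first reported language is used, since a format carries one value.
    language = next(iter(languages), None)
    if 'formats' in info:
        formats = []
        for fmt in info['formats'] or []:
            fmt = dict(fmt)
            # 'acodec' == 'none' marks a video-only format, so leave it untouched.
            if language and fmt.get('acodec') != 'none':
                fmt.setdefault('language', language)
            formats.append(fmt)
        merged['formats'] = formats
    return merged
```

Whether production values really belong in `categories` is the open question raised here; the sketch only shows that the merge itself is mechanical once that decision is made.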

@bashonly bashonly added pending-fixes PR has had changes requested NSFW labels Mar 25, 2024
Development

Successfully merging this pull request may close these issues.

Add "Language Spoken", and "Production" to PornHub IE
4 participants