Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backports from youtube-dl #9523

Open
12 tasks
Grub4K opened this issue Mar 24, 2024 · 16 comments
Open
12 tasks

Backports from youtube-dl #9523

Grub4K opened this issue Mar 24, 2024 · 16 comments
Labels
solved-upstream The issue has been solved in youtube-dl. Do not make PR for this

Comments

@Grub4K Grub4K added the solved-upstream The issue has been solved in youtube-dl. Do not make PR for this label Mar 24, 2024
@pukkandan
Copy link
Member

Why not update #21 ?

@Grub4K
Copy link
Member Author

Grub4K commented Mar 24, 2024

IMO its easier to track these in a new issue, especially since they were suppose to make it to last release but didnt

@dirkf
Copy link
Contributor

dirkf commented Mar 24, 2024

I'm sure you'll feel free to propose any question/fix that comes up while doing this.

@Grub4K
Copy link
Member Author

Grub4K commented Mar 24, 2024

I'm not sure if [utils] base for int_or_none should be ported.

The applications would be eporner, fptplay, goplay, iqiyi, pbs, wat, xfileshare (why it was implemented in ydl), utils.ohdave_rsa_encrypt

Thoughts?

@dirkf
Copy link
Contributor

dirkf commented Mar 24, 2024

I posit the simplified code and alignment with int().

Related, I suggest kfloat_or_none() (implicit scale=1000) and Kfloat_or_none()(implicitscale=1024, suitable for a recent YT tbr fix). Ofc this can be done with functools.partial()`, when needed but these seem useful enough to be worth including, modulo naming.

@Grub4K
Copy link
Member Author

Grub4K commented Mar 24, 2024

That code is not pretty like it was before for several reasons imo. Id like to think what the others say, but I guess compat with youtube-dl would be a problem otherwise...


@bashonly and I have been debating more traversal helpers as well, and kilo_or_none was one of the helpers that came up a lot. Alternatively, one could use functional programming and create a scale function, so {scale(1000, int_or_none)} could be used in traversal for example

@pukkandan
Copy link
Member

Alternatively, one could use functional programming and create a scale function, so {scale(1000, int_or_none)} could be used in traversal for example

Do we really need syntactic sugar for partial(int_or_none, scale=1000)? If we do, how about int_or_none(Ellipsis, scale=1000)?

That code is not pretty like it was before for several reasons imo.

Can you explain?

btw, @dirkf. Any reason you set base=None as default? Code would be simplified with base=10, which is the default for int

@bashonly
Copy link
Member

Do we really need syntactic sugar for partial(int_or_none, scale=1000)?

IMO yes, because in the extractors where we use this, we tend to use it in multiple places. There are already several extractors that define a kilo_or_none helper or something to that effect. Better to standardize it and deduplicate code

Any reason you set base=None as default? Code would be simplified with base=10, which is the default for int

int_or_none(1) currently returns 1. int(1, base=10) throws a TypeError

@pukkandan
Copy link
Member

pukkandan commented Mar 25, 2024

int_or_none(1) currently returns 1. int(1, base=10) throws a TypeError

gotcha

There are already several extractors that define a kilo_or_none helper or something to that effect. Better to standardize it and deduplicate code

I only found NetEaseMusic. Others are currently using partial directly, no?

If we do, how about int_or_none(Ellipsis, scale=1000)?

Impl:

diff --git a/yt_dlp/utils/_utils.py b/yt_dlp/utils/_utils.py
index a4f69385c..3225149ac 100644
--- a/yt_dlp/utils/_utils.py
+++ b/yt_dlp/utils/_utils.py
@@ -1944,6 +1944,15 @@ def urljoin(base, path):
     return urllib.parse.urljoin(base, path)


+def curriable(func):  # TODO: Better name?
+    @functools.wraps(func)
+    def wrapper(*args, **kwargs):
+        if args[:1] == (..., ):
+            return lambda x: func(x, *args[1:], **kwargs)
+        return func(*args, **kwargs)
+    return wrapper
+
+@curriable
 def int_or_none(v, scale=1, default=None, get_attr=None, invscale=1):
     if get_attr and v is not None:
         v = getattr(v, get_attr, None)

Usage traverse_obj(obj, (key, {int_or_none(..., scale=1000)}) (... being literal). The decorator could also be applied to float_or_none and any other function that needs it.

@dirkf
Copy link
Contributor

dirkf commented Mar 25, 2024

When using a functional expression in a traversal, expressions can become extremely lisp-y with multiple nested parentheses: as a late convert to GvR's transformation of Lisp, I'm not unhappy with that, but simplifying such expressions is a good thing and the float_or_none shorthands would be a good case.

Existing uses with partial are candidates for replacement over time, surely. And the choice between kfloat_or_none/kilo_float_or_none and Kfloat_or_none/kibi_float_or_none might focus attention on bugs like #9489.

@dirkf
Copy link
Contributor

dirkf commented Mar 25, 2024

In fact, there are several tens of extractors that scale by 1000, maybe enough to justify a kint_or_none() definition too, or just a specialisation of partial/scale:

kilo = lambda fn: functools.partial(fn, scale=1000)

But that's going to be evaluated in each traversal expression, unless pre-assigned which vitiates the shorthand, whereas a utils definition will be evaluated once per program run if needed.

@dirkf
Copy link
Contributor

dirkf commented Apr 24, 2024

Looking at the list, the FFmpeg detection is much more sophisticated here but possibly the underlying problem described in ytdl-org/youtube-dl#32735 (comment) persists.

@pukkandan
Copy link
Member

We work around it in the CLI by setting ffmpeg location to a contextvar.

yt-dlp/yt_dlp/__init__.py

Lines 965 to 968 in 89f535e

# We may need ffmpeg_location without having access to the YoutubeDL instance
# See https://github.com/yt-dlp/yt-dlp/issues/2191
if opts.ffmpeg_location:
FFmpegPostProcessor._ffmpeg_location.set(opts.ffmpeg_location)
There are plans to formalize usage of contextvars via a .globals module in the future. Some other issues like utils disobeying log level etc can also be addressed using it.

@dirkf
Copy link
Contributor

dirkf commented Apr 24, 2024

And #2191 (comment) ?

@dirkf
Copy link
Contributor

dirkf commented Apr 25, 2024

Regarding the updated int_or_none(), I'm inclined to think that it's more consistent to have this behaviour (still consistent with how it's used now)

  • a string is converted to int if possible, using a base if provided
  • a number is converted to int if necessary/possible, regardless of any base provided.

Maybe a float literal should be accepted (as if round(float()) had been applied)? I can't imagine any sites, or actually anyone at all, using (say) hexadecimal [hexa]decimals as in x2a.1a = 42.1015625, so this would only apply with base None.

@pukkandan
Copy link
Member

And #2191 (comment) ?

Yes, that is still an issue. The proposed .globals will be a public API and cli_to_api could be updated to guide user into setting it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
solved-upstream The issue has been solved in youtube-dl. Do not make PR for this
Projects
Status: Release blocker
Development

No branches or pull requests

4 participants