Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do... something? about machine_learning/forecasting #8780

Open
tianyizheng02 opened this issue May 27, 2023 · 1 comment
Open

Do... something? about machine_learning/forecasting #8780

tianyizheng02 opened this issue May 27, 2023 · 1 comment
Labels
enhancement This PR modified some existing files

Comments

@tianyizheng02
Copy link
Contributor

Feature description

In reference to #7305, it's clear to me that @nandiya isn't coming back to fix the warnings in machine_learning/forecasting/run.py. With that said, I want to discuss some more fundamental issues that I have with the file and get other opinions on what to do about them.

  • At the risk of sounding harsh, I think some of the code was haphazardly written. The warnings mentioned in [PYTEST WARNING] Machine learning forecasting #7305 are mostly due to the code providing far too few input observations (like, literally only 4 observations). Adding more observations doesn't necessarily resolve the warnings either, as there are other bugs in the code (passing bad arguments into functions and incorrectly reading a CSV file) that prevent the code from fully running. To me, it almost seems like the file was never run or tested independently, as these bugs and warnings would've been caught by just running the code.

  • I suspect that the code could've been plagiarized (or at the very least uncredited). At the top of the file, it says this:

    this is code for forecasting
    but i modified it and used it for safety checker of data

    ... but modified from where? No sources were ever referenced.

  • This file isn't an algorithm, much less a machine learning algorithm. Forecasting is a very general statistical task, and there's no single algorithm for it. In actuality, this file uses three different statistical methods (linear regression, SARIMAX, and SVR) to complete a forecasting task. Why not simply implement each of those algorithms in separate files?

  • Related to the previous point, this file is clearly a how-to. The SARIMAX and SVR code relies entirely on pre-existing implementations in other packages.

My main question is what we should do about this file. Fixing the existing code would resolve the warnings brought up in #7305 and would address the first issue, but it doesn't really address any of the other issues. What do you all think we should do about this file?

@tianyizheng02 tianyizheng02 added the enhancement This PR modified some existing files label May 27, 2023
@FatAnorexic
Copy link
Contributor

Haven't contributed here yet, but I could write some of them out if everyone wants. I think linear regression already exists. However, would I shelve them under math or keep them in ML?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement This PR modified some existing files
Projects
None yet
Development

No branches or pull requests

2 participants