Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] fetch public trades #9066

Open
wants to merge 248 commits into
base: develop
Choose a base branch
from

Conversation

TheJoeSchr
Copy link
Contributor

@TheJoeSchr TheJoeSchr commented Aug 18, 2023

Orderflow

Summary

This PR enables to use ccxt fetch_public_trades so freqtrade can use trades data for backtesting and trading

Quickstart:

  • use --dl-trades to fetch trades for timerange
  • enable using trades in config.json
"exchange": {
   ...
   "use_public_trades": true,
}
  • set orderflow processing configuration in config.json:
"orderflow": {
   "scale": 0.5,
   "stacked_imbalance_range": 3,
   "imbalance_volume": 1,
   "imbalance_ratio": 300
 },

TODO

  • [docs] write (new?) docs about how to access this new trades data in populate_indicators etc
    • dataframe['trades']
    • dataframe['orderflow']
    • dataframe['delta']
  • [docs] how to use --dl-trades to get trade data
  • [docs] which configuration options are available and how to use them
  • [tests] unit tests exist, but they are using too big for github test files. so I need to shrink them and rewrite tests to still make sense

from previous feedback by @xmatthias :

  • Some diffs are a bit strange at first glance (especially in the exchange class), so expect some questions once i look at it more carefully (might be the webUI doing the diffs oddly if 2 functions are next to each other and are similar enough though) ...

  • and i usually don't like random formatting changes (like - an unnecessary linebreak in a function that's not even touched otherwise) => fixed with commit 137ee07

  • i also assume we'll want to move some things around in the end (converter seems to become quite big now) - but i can eventually help with that.

What's new?

@TheJoeSchr TheJoeSchr marked this pull request as draft August 18, 2023 11:21
@TheJoeSchr TheJoeSchr force-pushed the feature/fetch-public-trades branch 2 times, most recently from a5660c6 to 137ee07 Compare August 18, 2023 11:39
@xmatthias xmatthias added the Enhancement Enhancements to the bot. Get lower priority than bugs by default. label Aug 18, 2023
@biiiipy
Copy link

biiiipy commented Aug 19, 2023

This is awesome! Orderflow has a lot of valuable information, can't wait to start experimenting with this!

@xmatthias xmatthias linked an issue Aug 19, 2023 that may be closed by this pull request
@dijvar

This comment was marked as off-topic.

@xmatthias
Copy link
Member

xmatthias commented Sep 25, 2023

@dijvar it's an open PR. Docs will be written before merging the PR (which the PR description also mentions).
Please don't pollute the PR discussion with pointless requests, thanks.

@TheJoeSchr you may want to look at the conflicts (don't just look at the conflicting lines though ...).
trades datahandling changed right around the time (one PR number before your's 😆 ) you submitted this PR (#9065) - so i suspect some of the code you've written is now already available - or types won't align etc.

I think the diff of #9065 will be helpful to show what changed .... i don't think much else changed ... but these changes WILL impact this PR for sure.

@TheJoeSchr
Copy link
Contributor Author

TheJoeSchr commented Sep 25, 2023

I think the diff of #9065 will be helpful to show what changed .... i don't think much else changed ... but these changes WILL impact this PR for sure.

thanks for the headsup @xmatthias . I think the overlap is minimal, because my "trades" are about L2 data aka orderflow trades and that PR is about trades the user/bot send to the exchange to execute.

but the naming overlap is unfortunate, I'm not sure how to handle it so there won't be any more confusion

@TheJoeSchr
Copy link
Contributor Author

Please, write a short docs to use it. Thank you very much for the work :)

Thanks for showing interest, this helps keeping my motivation up to finally write the docs. did you try the quickstart on top of this PR though? If yes, what steps didn't work?

Copy link
Member

@xmatthias xmatthias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see a few things that i don't like and make a review very difficult
For example:

  • changes that don't belong here (see explicit comment)
  • docstring (and signature) changes on methods that should not have been changed, causing pointless diffs (refresh_latest_ohlcv() - for example)
  • reintroduction of arrow to exchange (which has been removed in Add datetime helpers, reduce arrow usage to a minimum #8661, and was replaced with internal helpers directly for datetime)
  • duplicate methods in the same class (one will overwrite the other)

Due to the above, doing a proper review is currently near impossible - a the diff is unnecessarily large (especially in the exchange file).
Please do a review yourself (for example vscode's github pull requests extension can assist you with this by showing you the diff locally, allowing you to change what you don't like), comparing the differences - and identify what you actually INTENDED to change. That's something that's VERY difficult for me to identify.

Things you think should be changed but are not directly related to this pr (for example the max_calls point) - should either be removed, or extracted into individual PR's.
this becomes increasingly important with bigger PR's - small pr's combining 2-3 things - not really a problem, usually
but in a big feature like this - very problematic - as it draws focus apart - having high potential to introduce unwanted bugs due to these "non-directly" related changes.

freqtrade/data/dataprovider.py Outdated Show resolved Hide resolved
freqtrade/exchange/exchange.py Outdated Show resolved Hide resolved
freqtrade/exchange/exchange.py Outdated Show resolved Hide resolved
freqtrade/exchange/exchange.py Outdated Show resolved Hide resolved
freqtrade/exchange/exchange.py Outdated Show resolved Hide resolved
freqtrade/exchange/exchange.py Outdated Show resolved Hide resolved
freqtrade/exchange/exchange.py Outdated Show resolved Hide resolved
@Axel-CH
Copy link
Contributor

Axel-CH commented Mar 11, 2024

Hi @TheJoeSchr, first of all, thank you for your contribution 👍🏽 ; Just a quick question: the 'plot-dataframe' command is expected to work with trades data out of the box or some update are needed?

@xmatthias
Copy link
Member

the 'plot-dataframe' command is expected to work with trades data out of the box or some update are needed?

I don't think it is - but I'd also not invest time into plot-dataframe at this point - which i do consider a deprecated functionality (though with no immediate plans of removal).
Instead - plotting should be performed through freqUI - which is more flexible - and more importantly - WAY better when data becomes even slightly bigger (think - more than a few months of 5m data).

@Axel-CH
Copy link
Contributor

Axel-CH commented Mar 11, 2024

I'm using plot dataframe extencively, and it is allowing to do advanced indicators development. I'm not convinced that freq UI could cover efficiently such use cases.
So I still think plot dataframe is needed in the future.
If you need contribution on this part i will do it

@xmatthias
Copy link
Member

xmatthias commented Mar 11, 2024

@Axel-CH this is not the place to discuss this topic in depth.

There's no plans on having support for plot-dataframe included in this PR. This doesn't exclude eventual followup PR's eventually - but i don't see support for plot-dataframe as a requirement for this feature or pull request.

Copy link
Member

@xmatthias xmatthias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole exchange area (refresh_latest_trades() and children of it) will need some tests (ignore the already existing functions - tests for these exit).

The logic "is refresh needed" and consorts is quite prone to bugs (i know that from the ohlcv part) - which tests can help discover.
you should be able to reuse at least part of the logic for refresh_latest_ohlcv() for these tests - which should get you most of the way (though it's new logic - so needs dedicated tests to ensure it's not broken now - and won't break in the future, either).

# used in _now_is_time_to_refresh_trades
df['candle_end'] = df['candle_start'] + \
pd.Timedelta(minutes=timeframe_minutes)
df.drop(columns=['datetime'], inplace=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should use timeframe_to_next_date() instead.
You're assuming timeframe in minutes - which may not hold true on longer (or shorter) timeframes.

doing this (and the other comment) should allow us to simplify the code by removing _convert_timeframe_to_pandas_frequency which is faulty on bigger timeframes.

(_, timeframe_minutes) = _convert_timeframe_to_pandas_frequency(timeframe)
candle_next = candle_start + \
pd.Timedelta(minutes=timeframe_minutes)
# skip if there are no trades at next candle
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here - timeframe_to_next_date() will be your friend.

Comment on lines 531 to 538
_candle_type = CandleType.from_string(
candle_type) if candle_type != '' else self._config['candle_type_def']
data_handler = get_datahandler(
self._config['datadir'], data_format=self._config['dataformat_trades'])
ticks = data_handler.trades_load(pair)
trades_df = public_trades_to_dataframe(
ticks.values.tolist(), pair=pair)
return trades_df
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes - but the point is - dataprovider is available to the stratgy via self.dp.xxx - so - if we assume self.dp.trades() is used within a callback (for whatever reason) - then this should have "lookahead bias protection" for backtesting (truncate data at "current date") - otherwise i can use a callback (say, confirm_trade_entry()) and reject all trades where the price in 1h is not above current price - though in live, that's info i would never be able to have.

If we don't wanna do that for now - i'm fine with a small hint in the docstring saying "this is not meant to be used in callbacks" or similar - to alert users that this is potentially problematic.

Comment on lines 2259 to 2263
[ticks_pair, new_ticks] = self._download_trades_history(pair,
since=since_ms if since_ms else first_candle_ms, # noqa
until=until,
from_id=from_id)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

Comment on lines +2439 to +2443
df = self.klines((pair, timeframe, candle_type), True)
_calculate_ohlcv_candle_start_and_end(df, timeframe)
timeframe_to_seconds(timeframe)
plr = round(df.iloc[-1]["candle_end"].timestamp())
now = int(timeframe_to_prev_date(timeframe).timestamp())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if you reverse this logic - it can be simplified quite significantly (including the removal of the "copy" argument on klines()).

simply getting the last candle date (df.iloc[-1]['date']) - and then adding 1x timeframe to it will suffice.
_calculate_ohlcv_candle_start_and_end() may be useful - but it's doing the calculation on 1000 candles - while we only need it on the most recent one.

freqtrade/exchange/exchange.py Outdated Show resolved Hide resolved
freqtrade/exchange/exchange.py Outdated Show resolved Hide resolved
else:
until = int(timeframe_to_prev_date(timeframe).timestamp()) * 1000
all_stored_ticks_df = data_handler.trades_load(f"{pair}-cached")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason to use a "-cached" filename here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to avoid filename collisions. You think we don't need it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well i don't know - what's the reason to have this "duplicated"?

like - is there a problem we expect by appending to the already existing files?
Sure, it could delay initial startup (avoid holes in the data) ...

But on the other hand, assume someone would like to backtest over a period of a year - and has a dry-run running at the same time ...
at the end, they'd have the same exact data twice - once in -cached files, once without.

"stacked_imbalance_range": 3, # needs at least this amount of imblance next to each other
"imbalance_volume": 1, # filters out below
"imbalance_ratio": 300 # filters out ratio lower than
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's a ratio - isn't 300 a bit excessive? (like - a ratio should go from 0 to 1 (which means 100%)) ?

Comment on lines +477 to +480
'scale': {'type': 'number', 'minimum': 0.0},
'stacked_imbalance_range': {'type': 'number'},
'imbalance_volume': {'type': 'number'},
'imbalance_ratio': {'type': 'number'},
Copy link
Member

@xmatthias xmatthias Mar 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TheJoeSchr I'd apreciate if you could have a look at these
i assume we're able to define ranges for some of these ... though from the docs, the actual range wasn't clear to me - and i (intentionally) didn't want to discover it by reading the code.

Also, i ASSUME most are mandatory .. though i'm not entirely sur, so i'd rather leave that to you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement Enhancements to the bot. Get lower priority than bugs by default.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Volume Profile and Cluster Search as freqtrade feature
10 participants