@ydataai

YData

Accelerating AI with improved data


YData.ai Medium LinkedIn Twitter Youtube Data-Centric AI Discord YData Profiling YData Synthetic YData Academy

Welcome to YData

Our mission is to help data science teams access and understand their data assets, and produce quality data to successfully deploy machine learning models.

We're the creators of YData Fabric, the first data-centric platform for data quality. We're also strong advocates of open source software and we're actively developing ydata-profiling, ydata-synthetic, and ydata-quality, three open source projects focused on producing high-quality data for machine learning applications.

You can stay up to date with the latest developments on our News or follow our Medium blog for hands-on tutorials on our open source packages.

We have a growing community of data scientists on our Discord Server, where we discuss emerging topics in Data Profiling, Data Labeling, and Synthetic Data. Join us to share feedback and discuss feature requests!

You can also learn about our monthly events and data initiatives in our newsletter, or reach us at developers@ydata.ai.


Pinned

  1. ydata-profiling Public

    One-line-of-code data quality profiling and exploratory data analysis for Pandas and Spark DataFrames.

    Python 12.1k 1.6k

  2. ydata-synthetic Public

    Synthetic data generators for tabular and time-series data

    Jupyter Notebook 1.3k 218

  3. ydata-sdk Public

    Public SDK to interact with the platform, either public or private

    Python 8 3

  4. academy Public

    Tutorials for YData's Fabric platform

    Jupyter Notebook 27 7

  5. ydata-talkdatatome Public

    Make your dataset talk to you. The AI assistant for data preparation.

    Python 7 1

  6. sd-metrics Public

    A repository that collects different metrics to evaluate the quality of synthetic data under the scope of data democratization. The metrics evaluate the quality of the synthetic data under the following …

    2

Repositories

Showing 10 of 69 repositories