This post is intended for my friend, Liu.
A data scientist is a professional who uses statistics, mathematics, programming, and domain knowledge to extract insights from large, complex datasets. They apply techniques such as data mining, machine learning, and statistical modeling to analyze data, uncover patterns, and support informed decision-making in organizations.
Bilibili offers numerous high-quality courses. Choose the one that best suits you and remember to never give up!
Relevant skills:
Programming languages and related technologies: Python, SQL (for example on Oracle databases), R, HTML, CSS, JavaScript, Angular, etc.
Data visualization: Excel, Tableau, Power BI, etc.
Statistics knowledge: Probability Theory, Sampling Methods, Regression Analysis, etc.
Machine learning: CNN, RNN, Transformer, etc.
For beginners:
Python: I recommend Visual Studio Code + Miniconda, since this combination makes it convenient to manage virtual environments and use Jupyter Notebook.
Python for Visual Studio Code
Git: you should have a basic understanding of how to use Git for version control. Fork is recommended.
SQL: for personal practice, setting up a large SQL database can be challenging. Instead, you can use Python + SQLite, which needs no separate database server.
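To make the Python + SQLite suggestion concrete, here is a minimal sketch using the `sqlite3` module from Python's standard library. The `sales` table and its rows are made up purely for illustration; swap in whatever data you want to practice on.

```python
import sqlite3

# In-memory database: nothing is written to disk.
# Pass a file path such as "practice.db" instead to keep the data.
conn = sqlite3.connect(":memory:")

# Hypothetical practice table and rows, invented for this example.
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 30.0)],
)
conn.commit()

# A typical practice query: aggregate with GROUP BY, then sort the result.
rows = conn.execute(
    "SELECT region, SUM(amount) AS total "
    "FROM sales GROUP BY region ORDER BY total DESC"
).fetchall()

for region, total in rows:
    print(region, total)

conn.close()
```

Because SQLite ships with Python, this runs in any Miniconda environment or Jupyter Notebook without installing anything else, so you can focus on writing queries.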
To be continued