During the past couple of months I’ve read Emma Wedekind’s article 101 Tips For Being A Great Programmer (& Human) several times. It is always good to be reminded of what your profession is all about and how to integrate it into the larger picture: becoming the best human being you can be. It is one awesome, inspirational article. Since lately I’ve been part of both worlds, the software development world and the data science world (yes, in my opinion you can be both, check out this article), I wanted to do a similar thing for data scientists out there.
Of course, this is just a small subset of the skills you need to develop in order to become a successful data scientist. However, it is a start. They are not in any particular order, so if you have anything to add, feel free to leave a comment. Hopefully, this article will be as much of an inspiration to you as Emma’s article was to me. Onward!
Clean Code
Yes, coding is here to stay, at least for a little while. Learn how to do it properly. Name your variables and functions like a grown-up, add comments and pay attention to the structure. The field is growing and sooner or later you will work in a team. Clean code helps all members of the team stay in sync and on the same page.
Scraping
The Internet is full of data. In fact, data science as a field is “crossing the chasm” because of that. Learn how to gather the data that you need to do your job. There are many libraries out there that can help you with this. Since I work with Python, my personal favorite is Scrapy.
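To make the idea of gathering data from a page concrete, here is a minimal sketch. It deliberately does not use Scrapy: it relies only on Python’s standard `html.parser` module and an inline, made-up HTML snippet, so it runs without any network access. The class name `LinkCollector`, the helper `extract_links` and the sample page are assumptions made for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the <a> tag we are currently inside
        self._text = []    # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# Hypothetical page content; in a real scraper this would be an HTTP response body.
SAMPLE_HTML = """
<ul>
  <li><a href="/posts/clean-code">Clean Code</a></li>
  <li><a href="/posts/databases">Databases</a></li>
</ul>
"""


def extract_links(html):
    """Parse an HTML string and return the links found in it."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


links = extract_links(SAMPLE_HTML)
print(links)
```

A dedicated framework such as Scrapy adds what this sketch leaves out: fetching pages, following links, retries and throttling.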
Databases
You need to store data somewhere. Relational databases can help you with structured data. You should know how to normalize a database, i.e. learn the normal forms, in order to improve data integrity and reduce data redundancy. On top of that, learning SQL is probably the best long-term skill you can acquire, since it has been around forever.
If you are dealing with unstructured data, NoSQL databases can help you with that. Depending on the nature of the data, you can benefit from a different NoSQL database: MongoDB, Cassandra, etc.
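As a small illustration of the normalization idea mentioned above, the sketch below uses Python’s built-in `sqlite3` module with an in-memory database. The `customers` and `orders` tables and all the values in them are hypothetical. Customer details live in one table and orders refer to them by key, so an email address is stored once and can be changed in one place.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Customer details are stored once; orders only hold a reference to them.
conn.executescript("""
CREATE TABLE customers (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    amount      REAL NOT NULL
);
""")

conn.execute(
    "INSERT INTO customers (id, name, email) VALUES (1, 'Ana', 'ana@example.com')"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(1, 20.0), (1, 35.5)],
)

# Changing the email touches exactly one row, however many orders exist.
conn.execute("UPDATE customers SET email = 'ana@new.example.com' WHERE id = 1")

# A join brings the pieces back together when they are needed.
rows = conn.execute("""
    SELECT c.name, c.email, COUNT(o.id) AS n_orders, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id
""").fetchall()
print(rows)
```

Had the email been copied into every order row, the update would have needed to find and change each copy, which is exactly the kind of redundancy the normal forms are meant to remove.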
Calm Cheerfulness
As in every other profession, you and your team will have obstacles while trying to achieve your goals. Deadlines are going to be missed, Collab will not be available and your colleague will not understand the tasks in the right way. You will make mistakes and the job will not be a breeze. It is easy to be a hothead in those particular moments. Try to stay calm and instill calmness in others instead. This will help you and your team to get around obstacles and win.
Learn old Principles
There are a lot of innovations happening in our field. A good part of those innovations and ideas have their roots in principles that were invented in the last century or earlier. Data science is like some kind of steam-punk science: everything is somehow new and yet somehow old. Ancient concepts are still around even after a long, long time. Bayes’ theorem, Markov chains, cross-entropy, you name it. Find those principles and learn them. They are here to stay.
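Bayes’ theorem is a good example of how compact these old principles are: P(H | E) = P(E | H) · P(H) / P(E). The sketch below applies it to a made-up spam-filter scenario. The function name `posterior` and all the numbers are assumptions chosen only to show the arithmetic.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) for a binary hypothesis H and observed evidence E.

    prior               -- P(H)
    likelihood          -- P(E | H)
    false_positive_rate -- P(E | not H)
    """
    # Total probability of seeing the evidence, P(E).
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 1% of emails are spam, the filter flags 90% of spam
# and wrongly flags 5% of legitimate emails.
p = posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05)
print(round(p, 3))  # about 0.154
```

Even with a fairly accurate filter, a flagged email is spam only about 15% of the time under these assumed numbers, because spam is rare to begin with.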
Learn new Technologies
When you know the old concepts, it should be easy to apply them in new technologies, right? Technically, yes. However, in order to do so, you first have to know what those new technologies are and have at least some basic knowledge about them. The lesson is, of course: never stop learning and stay current. Old principles in combination with new skills are always in demand, regardless of profession.
Build your toolbox
There is a bunch of technologies you can pick from for any given problem. Pick the ones that will be your default. These technologies don’t have to be the best for every problem, but they should be the best for you, meaning that with them you are able to try out new ideas and mechanisms quickly.
My favorite combo is Python, TensorFlow and Jupyter Notebook. Find out which technologies are the best for you, practice new lessons using them first and then apply them in some other technology.
Take Breaks
Learn how to organize your time in order to be most efficient. Today, there are many productivity tips out there. Find your tempo and don’t get overwhelmed. Avoid burnout at all costs. Personally, I like to use the Pomodoro technique.
Data Visualization
Every once in a while you will have to present your results to clients and stakeholders. There will be charts, graphs and a bunch of visually represented data. Learn how to create visually stunning graphs. Apart from that, find out how each color affects the human psyche and which color should be used for which type of data. This way it will be easier for your clients to understand the points you are trying to make.
Take Vacation
As I mentioned previously, avoid burnout at all costs. It may take you months to recover from it and its gory effects. Take a vacation. Don’t let management, bosses or clients tell you otherwise. Take care of yourself.
Work-Life Balance
Another piece of take-care advice. The job is important. But so are family, hobbies and leisure. Don’t work yourself to death.
Read & Write Documentation
Before you start writing code using some open-source library, read the documentation. You will save precious time and avoid numerous failed tests. Take your time and don’t rush. Also, write documentation for your projects. Your team will be aligned and always on the same page thanks to that.
Create Great Presentations
PowerPoint, Keynote or Google Slides should be your best friend. Learn how to use these tools. If you have a nice presentation for your client or stakeholder, your goals will be easier to achieve. Make presentations for your team’s briefings and the team will know where to go and how to execute the plan.
Speak the Language of the Business
If you have a feeling that management and stakeholders are not really impressed by your cool new neural net architecture, you are probably right. Business is driven by value, and technology is just a means of achieving it. Learn how to translate your math and development vocabulary into something that business will listen to and value.
Understand the Why
Sometimes our clients want an AI or machine learning solution just because of the hype. Often they don’t need it, they just need a web application. That is why we provide both software development and data science services 🙂 Learn how to speak to clients and find out what they need the solution for. What problem exactly are they trying to solve? Then steer the solution in the right direction and explain your reasons to the client. Don’t try to impress them with fancy and expensive, but ultimately useless solutions. Otherwise, you will have just a bunch of failed projects and bad references.
Look Through the Requirements
As you may notice, we pay special attention to the relationship with our clients. In order to get to the mentioned why, sometimes you need to look through the requirements. Clients will unintentionally create the forest and you will not be able to see the trees. Analyze the requirements in detail and ask a lot of questions. That is the only way to get to the real reason and to provide the best service.
Exploratory Data Analysis
Learn how to explore the data and create new features from existing ones. Apart from that, learn how to observe data from different angles. This is probably one of the most important steps when building machine learning and deep learning solutions.
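A first pass at exploring a dataset usually means summarizing each column, deriving a feature or two and checking how the columns relate to each other. The sketch below does this with only the Python standard library. The housing records, the helper names `summarize` and `pearson`, and the derived feature `area_per_room` are all made up for illustration; in practice a library such as pandas would do most of this for you.

```python
import statistics

# Hypothetical records: (floor area in m^2, number of rooms, price in thousands)
homes = [
    (50, 2, 100),
    (80, 3, 150),
    (120, 4, 260),
    (65, 2, 140),
    (200, 6, 390),
]

areas = [area for area, _, _ in homes]
prices = [price for _, _, price in homes]


def summarize(values):
    """Basic descriptive statistics for one numeric column."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long columns."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# A new feature built from two existing ones.
area_per_room = [area / rooms for area, rooms, _ in homes]

area_summary = summarize(areas)
corr = pearson(areas, prices)

print(area_summary)
print(area_per_room)
print(corr)
```

Looking at the same data from a different angle, here area per room instead of raw area, is often where useful features come from.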
Team is Larger Than the Individual
NEVER. FORGET. THIS. No matter how good you are or how many years of experience you have under your belt, you cannot achieve as much as a team can. Learn how to be a team player, help others and communicate well. Team sports are hard, but necessary for big achievements.
Basic Machine Learning Algorithms
Knowing how these algorithms work in detail is one of the greatest skills you can have in your toolbox. In fact, you can start solving many problems by applying these techniques.
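The article does not single out specific algorithms, so as one example of what “in detail” can look like, here is a minimal sketch of k-nearest neighbours classification written from scratch. The choice of algorithm, the function name `knn_predict` and the toy data are assumptions for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    train -- list of (features, label) pairs, where features is a tuple of numbers
    query -- tuple of numbers with the same length as the training features
    """
    # Sort training points by Euclidean distance to the query and keep the closest k.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups of points.
train = [
    ((1.0, 1.0), "small"),
    ((1.5, 2.0), "small"),
    ((2.0, 1.5), "small"),
    ((6.0, 6.5), "large"),
    ((7.0, 6.0), "large"),
    ((6.5, 7.0), "large"),
]

print(knn_predict(train, (1.2, 1.4)))
print(knn_predict(train, (6.4, 6.2)))
```

Writing an algorithm out like this once makes it much easier to reason about its behavior later, for example how the choice of `k` or the distance measure changes the result.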
Do the Math
The majority of concepts used in data science were invented 50 or more years ago. In fact, all of those ideas are based heavily on math. This is the most troubling part for people who are trying to get into the field. The most common question that I get at meetups and conferences is: “How much math should I know?” Those were all people with a software development background trying to get into the data science world. Actually, this was the question that I asked myself long ago when I started my journey through this universe. Data science is all about math. Brush the dust off your old university books and learn everything you can.
Prioritize and Execute
Some tasks are more important than others. In order to build momentum and finish the project on time, learn how to plan and prioritize tasks. Once you have a solid plan, execute tasks one at a time.
Discipline is Freedom
Freedom is one of the main values we try to nourish. While talking with people about what freedom means to them, I’ve noticed that everyone has a different perspective on this topic. However, no matter how different their understanding of the word freedom was from mine, we could agree that discipline is what will get you there. This might sound weird, but once you have prioritized the tasks and started working on them, you are already exercising freedom. Being able to choose your tasks and then actually doing them is the definition of freedom.
Take Ownership
You can always blame the weather, fate or your daily horoscope for your failures. That is the easy path. In order to make any kind of meaningful progress you need to make mistakes and learn from them. If you don’t take ownership of your mistakes (and your life), you will miss a precious learning opportunity.
Image Processing
Screens are getting bigger and bigger. Phones are getting better and better. Also, Virtual Reality and Augmented Reality are on the rise so we may expect a lot of projects coming from those industries. Make sure that you follow this trend. It will help your career tremendously.
Deep Learning
We are all still skeptical about neural nets and aware of their limitations. However, they have become an industry standard and they can still solve a big chunk of the problems we have. They are also really popular today, so it is always good to know more about them.
One Thing at a Time
Don’t try to do everything at once, or to be everything at once. Structure your project and go one step at a time. Keep the big picture in mind, but don’t let it overwhelm you. Solve what is in front of you and progress in increments. It takes patience and self-control, but that is the beauty of it.
Share Knowledge
“If you can’t explain it simply, you don’t understand it well enough.” Only when you start explaining a concept to a peer will you realize how well you understand it. It is good for you and it is good for the people you are mentoring. Apart from that, there are several ways of sharing knowledge: give talks, mentor, write blog posts, etc. You don’t have to do them all, pick the one that suits you the most. If you shy away from getting on stage, you can always visit your local events and support them. Afterwards, you can write a blog post about it.
Morning Routine
Developing a morning routine is probably the best productivity tip I can give you. My routine consists of several actions:
- write a diary
- exercise
- take a cold shower
- take a 30-minute walk to the office with an audiobook or podcast on my headphones
Find out what works for you and stick to it.
Kill the ego
Realize that it is not about you. It is about the team you are a part of, your project and your community. Create a cake so big that everyone can have a piece. Be kind and stay humble.
Trust the Process
Trust the Process is a mantra used by fans of the NBA’s Philadelphia 76ers during their rough patch. The term was coined by Sam Hinkie and today is popular elsewhere in sports and culture. It refers to the situation in which you should temporarily forget the big picture and focus on the task at hand. Execute or learn what is in front of you and, incrementally, you will become a better data scientist and a better human being. Trust your plan and trust your process.
Conclusion
Well, those are just some of the skills that you need to develop in order to become a better data scientist and a better human being. I hope this will inspire you to work on these skills and then some more. What do you think? Any special skills you might want to add? Leave a comment!
Thank you for reading!
Read more posts from the author at Rubik’s Code.