The Data Scientist Show
Author: Daliana Liu
© Daliana Liu
Description
A deep dive into data scientists' day-to-day work, the tools and models they use, how they tackle problems, and their career journeys. This podcast helps you grow a successful career in data science. Listening to an episode is like having lunch with an experienced mentor. Guests are data science practitioners from various industries, AI researchers, economists, and CTOs of AI companies. Host: Daliana Liu, an ex-Amazon senior data scientist with 180k followers on LinkedIn.
Join 20k subscribers at www.dalianaliu.com to learn more about data science, career, and this show. Twitter @DalianaLiu.
89 Episodes
Daliana interviewed 6 data scientists from her meetup in New York City. It's a unique episode where you get to hear the real frustrations of data scientists. We talked about struggles working in healthcare, finance, data quality and AI, how to advocate for yourself, and align with your managers.
Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
Most experiments fail. Kristi Angel shares her expertise on scaling experimentation and avoiding common A/B testing pitfalls. Learn five things that can help you boost test velocity, design impactful experiments, and leverage knowledge repos. (Chapters below)
Kristi Angel’s LinkedIn: https://www.linkedin.com/in/kristiangel/
Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
(00:00:00) Intro
(00:01:26) Why do most experimentations fail?
(00:07:05) Mistakes in choosing metrics
(00:10:05) Is revenue a good metric?
(00:13:18) Split metrics in three ways
(00:15:10) Daliana's story with too many category breakdowns
(00:16:59) What makes the best data science team?
(00:19:24) Data scientist work in silo vs in a data science team
(00:21:15) Building a knowledge center
(00:23:40) Example of knowledge center; nuance of experimentations
(00:26:09) How many metrics and variants?
(00:30:56) How to reduce noise - CUPED
(00:33:01) Future of A/B testing
(00:38:33) Q&A: Low statistical power
Julia Silge is an engineering manager at Posit PBC, formerly known as RStudio, where she leads a team of developers building open source software for MLOps. Before Posit, she finished a PhD in astrophysics, worked for several years in the nonprofit space, and was a data scientist at Stack Overflow, where some of her most public work involved the annual developer survey. We talked about MLOps tools, challenges in survey data, text analysis, and balancing her interests in data science and engineering.
Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
(00:00:00) Introduction
(00:00:56) Getting into data science
(00:04:50) Transition from data centers to engineering manager
(00:14:04) Common challenges in tool development
(00:17:38) Challenges with survey data
(00:26:47) Engineering skills for data scientists
(00:28:59) Balancing roles
(00:34:49) Developing skills in Exploratory Data Analysis (EDA)
(00:39:19) Python vs. R for data analysis
(00:44:40) Exciting aspects in career and personal life
Wes McKinney is the co-creator of the pandas library and the cofounder of Voltron Data. He is currently a Principal Architect at Posit and an investor in data systems.
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
Wes' LinkedIn: https://www.linkedin.com/in/wesmckinn/
(00:00:00) Introduction
(00:00:44) How Pandas Started
(00:06:40) Voltron Data
(00:10:03) Benefits of Easy-to-Use Data Tools
(00:13:20) The Rise of New Data Tools
(00:18:07) Choosing Tools: Vertical or Flexible?
(00:23:01) Big Models and Data Tools
(00:29:29) Challenges in Building a Product
(00:31:28) Becoming a Top Architect
(00:34:55) Missed Aspects of Previous Roles
(00:39:04) A Busy Week: Advising, Designing, Investing
(00:43:42) Improving Open Source
(00:45:24) How to Decide What to Work On
(00:46:28) What he’s learning now
(00:47:56) Excitement in Career and Life
(00:48:29) Using ChatGPT for Learning
(00:50:27) Future Impact Goals
Christopher Fricker is a senior director in analytics and BI at Renaissance Learning. He started his career in finance and later became a data science consultant working with Meta, Netflix, and pre-IPO tech companies doing analytics. We talked about the mental models that helped him grow from a finance analyst to an analytics leader.
Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Chris’ LinkedIn: https://www.linkedin.com/in/christopherfricker/
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
(00:00:00) Introduction
(00:01:46) How to get promoted quickly
(00:08:40) Power vs authority
(00:11:21) First-principles thinking
(00:32:34) ROI of a data team
(00:40:29) How to be persuasive
(00:54:52) All Data is wrong
(00:56:22) How he audits the data
(01:00:52) How to make someone help you at work
I interviewed Geoffrey Angus, ML team lead at Predibase, to talk about why adapter-based training is a game changer. We started with an overview of fine-tuning and then discussed five reasons why adapters are the future of LLMs. Later we also shared a demo and answered questions from the live audience. Try fine-tuning for free: https://pbase.ai/GetStarted
Geoffrey’s LinkedIn: https://www.linkedin.com/in/geoffreyangus
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
(00:00:00) Intro
(00:01:19) What is Fine-tuning?
(00:08:18) Utilizing Adapters for Finetuning Enhancement
(00:09:50) 5 reasons why adapters are the future of LLMs
(00:26:34) Common Mistakes in Adapters Usage
(00:28:34) Training Your Own Adapter
(00:32:23) Behind the Scenes of the Adapter Training Process
(00:37:51) Config File Guidance for Fine-Tuning
(00:39:41) Debugging Strategies for Suboptimal Fine-Tuning Results
(00:42:23) User Queries: Creating a LoRA Adapter and Future Support
(00:51:06) Key Takeaways and Recap
Jay Feng created a viral project using Seattle crime data and later got into data science. He later founded "Interview Query" helping data scientists get jobs. We'll talk about how he landed his data science job through his blog, and his journey from data scientist to founder. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
Jay Feng's LinkedIn: https://www.linkedin.com/in/jay-feng-ab66b049/
Jay Feng's YouTube: https://www.youtube.com/c/DataScienceJay
(00:00:00) Introduction
(00:01:11) From engineer to data scientist
(00:03:10) Got a job through a project
(00:05:35) Daliana's portfolio project with Zillow
(00:09:13) From data scientist to entrepreneur
(00:13:19) "Tinder" for job
(00:15:01) How he chose companies to work for
(00:15:56) Why he became an entrepreneur
(00:17:37) How many hours does he work
(00:18:54) Challenges when building "Interview Query"
(00:20:18) Speed vs scale
(00:22:11) Growth hacks he used
(00:24:22) YouTube vs newsletter
(00:27:21) Lessons he learned as a CEO
(00:29:16) How to grow from tech employee to founder
(00:31:59) How he defines success
(00:34:38) If you have a business idea for Jay
Erik Gafni builds AI systems and teams. He founded Eventum AI (https://bit.ly/eventum-ai), an ML consulting company working with high-growth startups. We talked about GenAI projects he worked on, how he built production ML systems, how to scale ML teams, and his journey from biologist to ML researcher.
Interested in working with Erik: https://bit.ly/erik-consulting
Erik's LinkedIn: https://bit.ly/erik-gafni-LI
(00:00:00) Introduction
(00:01:59) Is GenAI overhyped?
(00:04:28) Accent translation with AI
(00:11:58) Social media app with AI
(00:14:00) Stable diffusion model evaluation
(00:15:57) "Consult-to-hire" model
(00:17:35) AI in biotech
(00:22:46) Self-supervised learning
(00:31:22) How he hires people
(00:33:19) Research vs production
(00:35:57) Is AGI coming?
(00:37:30) New trends in GenAI
(00:41:45) Data quality in GenAI
(00:42:58) Philosophy in LLMs
(00:49:48) OpenAI vs Open Source
(00:53:58) Mistakes he made
(00:57:41) How did he get into ML
Jay Feng is the CEO of Interview Query, a service that helps data scientists get jobs. Previously he worked as a data scientist at Nextdoor and Monster. We talked about the data science job market, the rise of AI engineering, and the soft skills people overlook during interviews. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
Jay Feng's LinkedIn: https://www.linkedin.com/in/jay-feng-ab66b049/
Jay Feng's YouTube: https://www.youtube.com/c/DataScienceJay
(00:00:00) Introduction
(00:01:11) Data science job market in 2024
(00:09:13) Build projects with AI
(00:16:19) Soft skills in interviews
(00:23:18) Daliana's story on "socializing ideas"
(00:28:38) Common mistakes in interviews
(00:35:30) Product DS vs ML interviews
(00:36:27) Product analytics interview questions
(00:39:18) Career transition in DS
(00:43:04) Jay's career journey
(00:45:38) Is there a principal data analyst?
(00:51:52) AI engineer
(00:54:28) New roles vs obsolete roles in DS
(01:04:46) Is data science dead?
We are joined by two data scientists who have firsthand experience with layoffs. We’ll talk about how to negotiate severance packages, how to handle stress, strategies for job hunting post-layoff, and how to reduce risks in full-time employment.
Working with Daliana on personal branding: https://forms.gle/heNuZzaHjaAMQwLu6
Her email: daliana@dalianaliu.com
Guests:
Susan Shu Chang:
Linkedin: https://www.linkedin.com/in/susan-shu-chang/
Newsletter: susanshu.substack.com
Sundar Swaminathan
Linkedin: https://www.linkedin.com/in/sswamina3/
Website: https://www.sundarswaminathan.com/
(00:00:00) Introduction
(00:06:13) Severance Negotiation
(00:20:29) Identity crisis
(00:26:22) Job search after layoff
(00:30:21) Networking
(00:35:23) Risk at pre-seed startups
(00:37:03) How should data scientists pick companies
(00:40:43) What to ask hiring managers
(00:45:01) Does GenAI change interview processes?
(00:47:17) Are data science teams getting leaner?
(00:48:56) Future of data science roles
(00:50:37) Full time employment and job security
(00:53:46) Benefits of full time jobs
(00:58:14) Reduce risk of being laid off
(01:00:43) How to sell yourself
(01:02:43) How to plan your finances
(01:05:09) How to become an independent consultant
Jenny Wu is a data analyst turned sales engineer for data products at Hex. We talked about sales engineer vs data analyst, how to design a career based on your personality, and how to transition into a customer-facing role.
Jenny’s LinkedIn: https://www.linkedin.com/in/jenny-wu-...
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
(00:00:00) Introduction
(00:01:34) What is a Sales Engineer?
(00:09:35) Sales Engineering Day-to-Day
(00:13:09) Challenge in sales
(00:21:37) Traits of Successful Salespeople
(00:30:32) Stakeholder Engagement
(00:36:24) Getting into customer-facing roles
(00:43:55) Quitting her job to travel the world
(00:48:05) Advice on Career Breaks
(00:50:39) Embedding Career and Personal Goals
(00:51:57) How do you achieve happiness?
Barry McCardel is the cofounder and CEO of Hex (free trial: hex.tech/dsshow), a collaborative data workspace. Their customers include Fivetran, Notion, and Anthropic. We talked about what the future of data teams looks like, how to tackle the challenges of data team collaboration, and how to leverage AI in data science workflows.
60-day Free Trial: hex.tech/dsshow
Barry’s LinkedIn: https://www.linkedin.com/in/barrymccardel
(00:00:00) Introduction
(00:01:25) Is AI replacing data scientists?
(00:06:08) Are data science teams getting smaller?
(00:09:54) What is Hex?
(00:11:24) How to communicate with stakeholders
(00:24:29) Should data scientists be full stack?
(00:31:23) How data teams measure ROI
(00:33:35) Quantitative vs qualitative analysis
(00:35:33) When you shouldn't use data? Data vs product intuition
(00:41:39) How to hire your first data team?
(00:48:59) Is the modern data stack dead?
(00:53:55) GenAI in data science workflows
(00:59:03) Future of data scientist
(01:02:30) New features in Hex
Siddhartha Sharan is a Senior Data and Applied Scientist at Microsoft, helping product teams make data-driven decisions. Currently he is working on an AI product built with OpenAI APIs for sentiment analysis. We talked about how he evaluates AI products built with large language models at Microsoft, product data science, and how he went from a business background to data science. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Sid’s LinkedIn: https://www.linkedin.com/in/siddharthasharan/
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
(00:00:00) Introduction
(00:05:20) How does Microsoft evaluate AI products
(00:16:17) Using OpenAI API for sentiment analysis
(00:25:29) Microsoft data science team culture
(00:26:52) DS, PM collaboration
(00:28:29) Three steps to build trust in data science
(00:30:13) How he got into Microsoft
(00:34:09) Level up in Genetech
(00:36:09) ML engineer vs Product DS
(00:37:43) Core skills in product DS
(00:40:20) Hiring
(00:42:47) How to deal with burnout
(00:45:03) Should you overwork to earn trust?
(00:45:44) Daliana's story about first day at Amazon
(00:49:54) Will AI replace data scientists?
(00:51:32) Data scientist's role in GenAI
(00:54:32) How to keep up with GenAI
Jess Ramos is a Senior Data Analyst at Crunchbase, a LinkedIn Learning Instructor, and a content creator in the data space. She has a bachelor's degree in Math, Spanish, and Business from Berry University and a master's in Business Analytics from the University of Georgia. Today we’ll talk about SQL in the real world, data analysts vs data scientists, whether job hopping is bad, and how she negotiated her salary. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Jess’ Linkedin: https://www.linkedin.com/in/jessramosmsba/
Preparing to Get a Job in Data Analytics: shorturl.at/sCNPT
Solve Real-World Data Problems with SQL: https://bit.ly/3Zq6wnd
Big Data Energy Newsletter: https://bit.ly/46x4rIR
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
(00:00:00) Introduction
(00:01:24) Why Jess left her job at Freddie Mac
(00:03:25) Is job hopping bad
(00:04:42) How to explain short job stints when interviewing
(00:06:49) Jess's day-to-day work and tech stack
(00:09:15) SQL in the real world
(00:12:10) How to talk data to stakeholders
(00:18:33) How Jess prepares for SQL interviews
(00:28:11) Data analysts vs data scientists
(00:32:11) Choosing a career path
(00:47:19) How to ask recruiter questions
(00:50:15) Jess's LinkedIn content creation journey
(00:59:03) The future of Jess's career
(01:03:42) Jess's favorite books
Mehdi Noori is an applied science manager at the Generative AI Innovation Center at Amazon. I used to work with Mehdi while we were at the Machine Learning Solutions Lab at AWS. Before Amazon, Mehdi was a data scientist working on marketing intelligence. He has a PhD in civil engineering and sustainability from the University of Central Florida. Subscribe to Daliana's newsletter for more on data science and career: www.dalianaliu.com
Mehdi Noori: https://www.linkedin.com/in/mehdi-noori/
Predicting Soccer Goals: https://aws.amazon.com/blogs/machine-learning/predicting-soccer-goals-in-near-real-time-using-computer-vision/
My friend Misty moved to a farm in Portugal after a 20-year career in finance. We talked about her experience moving from busy corporate life to farm life, where she does a lot of manual work: whether it was challenging, how her finances work, and her advice for other people who want to explore a different path outside of modern city life. I hope this episode will give you a different perspective on your career.
Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
(00:00:00) Introduction
(00:11:41) Life on the farm
(00:15:46) Her finance plans
(00:22:55) Her career journey
(00:27:14) What do accountants do
(00:32:29) I thought I would be happy
(00:41:25) Daliana's personal view about finance; when it's enough for you
(00:44:41) Does she feel lonely on a farm?
(00:48:39) What if she didn't leave the corporate world?
(00:54:07) Does she regret her decision
Pan Wu is a senior manager of data science at Meta. We talked about why he moved from machine learning to product data science, projects he worked on at Uber, Linkedin, and Meta, and how he transitioned from IC to manager. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Pan’s LinkedIn: https://www.linkedin.com/in/panwu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
(00:00:00) Introduction
(00:01:30) Why he transitioned from MLE to product DS
(00:07:38) Meta data scientists skill sets
(00:15:49) When his interest shifted from MLE to product DS
(00:18:04) Is MLE more respected?
(00:25:46) A/B testing deep dives in 3 steps
(00:28:21) Built a tool at Linkedin
(00:35:52) How to sell your project
(00:41:07) Junior vs senior data scientist
(00:43:24) From staff data scientist to manager
(00:45:18) Explore being a manager
(00:46:24) Cultures in Uber, Linkedin, TrueCar
(00:52:09) Data science over the past 10 years
(00:55:06) MLE vs DS fun and frustration
(00:57:26) Product DS reality
(00:59:10) Learning new skills
(01:01:39) Mistakes he made
(01:06:34) Future of data science
(01:08:04) Will data scientists be replaced by AI
(01:09:42) Three skills he looks for when hiring
Betty Zhang is a data scientist currently working at a cloud security company. Previously she was a data scientist at Amazon Web Services. Today we’ll talk about her computer vision projects in sports, data science use cases in cybersecurity, her path from business major to data scientist, and her experience working in startups vs big tech companies. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Betty’s Linkedin: https://www.linkedin.com/in/betty-zhang-0bb63731/
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
(00:00:00) Introduction
(00:01:21) Computer Vision Project in Sports at AWS
(00:12:28) Challenges in computer vision
(00:14:02) Time allocation for ML projects
(00:15:22) 3 key skills for computer vision
(00:17:20) From business analyst to ML engineer
(00:18:14) How she got her data scientist job through Linkedin
(00:21:32) How she got into Amazon
(00:22:17) Three tech skills needed during Amazon interviews
(00:26:11) Why she joined a Cyber Security startup
(00:27:22) Three cybersecurity use cases
(00:29:47) Anomaly detection
(00:30:40) ML for cybersecurity
(00:34:43) Tech stacks Amazon vs Startups
(00:39:35) Startups vs big tech
(00:45:56) Balance learning and impact
(00:48:35) Advice for new data scientists
Che Sharma came back to discuss toxic behaviors in experimentation culture and provide actionable advice on how to handle those situations, how to have rigor and integrity when designing and analyzing A/B tests.
Che was the 4th data scientist at Airbnb, later he joined Webflow as an early employee. In 2021 he founded Eppo, a next-gen A/B experimentation platform designed for modern data and product teams to run more trustworthy and advanced experiments. We talked about A/B testing best practices, A/B testing for ML models, and Che’s career journey.
Reach out to Che: https://www.linkedin.com/in/chetanvsharma/
Jason Yosinski was a founding member of Uber AI Labs. He is also a co-founder of WinscapeAI, a company dedicated to using custom sensor networks and machine learning to increase the efficiency and sustainability of wind farms. Jason holds a PhD in computer science from Cornell University. We talked about his experience at Uber AI, his research in deep learning, and ML for wind farms. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Jason’s Website: https://yosinski.com/
Jason’s LinkedIn: https://www.linkedin.com/in/jasonyosinski/
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu
(00:00:00) Introduction
(00:06:06) His advice for Uber ML teams
(00:16:03) From research to industry
(00:20:24) ML for wind farms
(00:25:40) Metrics for wind energy prediction
(00:29:23) Start with a small dataset
(00:32:00) ML in academia vs. the industry
(00:33:24) Do you need a PhD for ML?
(00:38:14) Daliana's story about grad school
(00:41:37) The value of a PhD
(00:43:13) ML Collective
(00:48:36) Technical communication
(00:57:21) ML Skillsets
(00:59:45) Future of machine learning
(01:05:23) Personal development: Hoffman process
(01:15:13) Do things that excite you