Category: Technology



It's common for data scientists to narrowly focus on the APIs of the tools they use every day (pandas, matplotlib, pymc, dask, etc.) to the detriment of any focus on the surrounding programming language. In the case of tools like matplotlib, the total amount of Python we need to know is limited to what existed when matplotlib was first developed. (Did you know that matplotlib predates @property? That explains a lot…) In the case of newer tools like dask or pymc or even pandas, we may encounter some newer parts of Python (e.g., context managers or descriptors) as part of these tools' API design, but it's very easy to accept these as mere “syntax.”

In this talk, we will discuss where a deeper understanding of pure Python has direct and immediate consequences for your work as a data scientist. We will discuss where the parts of Python you may have skimmed over show up in analytical code, outside of the mere “syntax” of an API.

This talk will be organised around answering the following questions:
– why do generators even matter (and who cares about coroutines)?
– the itertools module is great… if I were writing scripts, but where does it show up in data analysis?
– object orientation seems like a bunch of bureaucracy—can it really simplify my analytical code?
– why should I bother with data types in builtins and collections; is the pandas.DataFrame not enough?
– knowledge of Python internals would probably be useful, if I were a programmer writing scripts, but why do they matter for a data scientist?

Bio:
James Powell
James Powell is the founder and lead instructor at Don’t Use This Code. A professional Python programmer and enthusiast, James got his start with the language by building reporting and analysis systems for proprietary trading offices; now, he uses his experience as a consultant for those building data engineering and scientific computing platforms for a wide range of clients using cutting-edge open source tools like Python and React. He also currently serves as a Board Director, Chair, and Vice President at NumFOCUS, the 501(c)(3) non-profit that supports all the major tools in the Python data analysis ecosystem (i.e., pandas, numpy, jupyter, matplotlib). At NumFOCUS, he helps build global open source communities for data scientists, data engineers, and business analysts. He helps NumFOCUS run the PyData conference series and has sat on speaker selection and organizing committees for 18 conferences. James is also a prolific speaker: since 2013, he has given over seventy (70) conference talks at over fifty (50) Python events worldwide.

PUBLICATION PERMISSIONS:
PyData Organizer provided Coding Tech with permission to republish PyData tech talks.

CREDITS:
PyData YouTube channel: https://www.youtube.com/@PyDataTV
https://www.youtube.com/watch?v=-Y0eTPoMjVk



In this tutorial, Marcus Hellberg shows how to build a full-stack todo app with a Spring Boot Java backend and a React TypeScript frontend connected with the new Hilla framework. Create the app with:
npx @hilla/cli init --react todo

0:00 – Intro
0:18 – Creating a Hilla Spring Boot + React project
1:08 – Adding database dependencies
1:50 – Starting the dev server (Spring Boot and Vite)
2:37 – Creating a Todo View
4:32 – Creating the backend data model
6:40 – Creating an Endpoint server API
9:20 – Fetching todos from the server
12:38 – Adding new todos: input and server call
16:57 – Listing todos
18:03 – Updating todos
20:55 – Outro

PUBLICATION PERMISSIONS:
The original video was published with the Creative Commons Attribution license (reuse allowed). Link: https://www.youtube.com/watch?v=nIWshZR_wtU
https://www.youtube.com/watch?v=ssw1ANCV1h8



The outline of the talk goes as follows:
- We will give a very brief introduction to pandas, talk about its importance, and point out some of its shortcomings, as its own creator did half a decade ago (10 minutes).
- We will enumerate some of the current pandas alternatives and classify them (pandas-like vs bespoke, single-node vs distributed) (5 minutes).
- We will do a live demo of how to analyze and manipulate a relatively big dataset using Polars inside Orchest Cloud and showcase some of its unique capabilities (20 minutes).
- Recommendations and conclusions (5 minutes).

After the talk, you will have more information on how some of the modern alternatives to pandas fit into the ecosystem, and will understand why Polars is so exciting and promising. Prior exposure to data manipulation with Python (not necessarily with pandas) will help make the most of the presentation. The talk will build upon this blog post about Polars.

PUBLICATION PERMISSIONS:
PyData Organizer provided Coding Tech with permission to republish PyData talks.

CREDITS:
PyData YouTube channel: https://www.youtube.com/@PyDataTV
https://www.youtube.com/watch?v=R0Y3LtZuUy8