allows developers and programmers to give stakeholders a clearer understanding of what questions may reasonably be asked of a dataset, with very little programming effort. Basic questions such as "how much data is actually present in every row?" or "what are the unique, or most common, values in this column?" can help shave up to 30% off the data science workflow, according to an unverified source on the internet. From my perspective, it is simply an essential first step.
Carnegie Mellon has a deep-dive chapter on the subject,
and here's a brief overview: https://www.svds.com/value-exploratory-data-analysis/
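Before reaching for a profiling tool, the two basic questions mentioned earlier can be answered with a few lines of plain pandas. This is only a sketch: the DataFrame and its column names (`patient_id`, `clinic`, `visits`) are made up for illustration and stand in for whatever CSV you load.

```python
import pandas as pd

# Hypothetical toy data standing in for a CSV of your choosing.
df = pd.DataFrame({
    "patient_id": [1, 2, 3, 4],
    "clinic": ["north", "south", "north", None],
    "visits": [3, None, 1, 2],
})

# How much data is actually present in every row?
present_per_row = df.notna().sum(axis=1)

# Share of non-missing values in each column.
present_per_column = df.notna().mean()

# What are the unique, or most common, values in this column?
unique_clinics = df["clinic"].dropna().unique()
clinic_counts = df["clinic"].value_counts()

print(present_per_row.tolist())
print(present_per_column)
print(clinic_counts)
```

With a real file, replacing the toy DataFrame with `pd.read_csv("your_file.csv")` should be the only change needed.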
Pandas profiling and Sweetviz are simple installs that work well with Streamlit.
To test, you can set up a Streamlit share and then install both packages.
Here's some Python code wrapped in Streamlit that provides both, for you to test with a CSV of your choosing.
I pulled most of this from the video here:
https://www.youtube.com/watch?v=zWiliqjyPlQ. The video goes in depth; I skipped to about minute 30 to get into the Sweetviz material and then headed over to the GitHub repo
to use this file as the basis for the even more stripped-down version seen above.