US Bikeshare Data Analysis
Built for the Udacity–Google Palestine Launchpad Data Science Nanodegree (PDSND). Interactive terminal app for exploring Chicago, NYC, and Washington bikeshare data with flexible time filters and rich stats.
Overview
Developed as part of the Udacity–Google Palestine Launchpad Data Science Nanodegree (PDSND), this command-line app analyzes US bikeshare data for Chicago, New York City, and Washington. Users choose a city and optional time filters (month/day/both/none), then the program prints concise insights: busiest times, popular stations and routes, trip durations, and user demographics—plus an optional raw-rows viewer for quick audits.
Key features
- Interactive filters: guided prompts with validation for city, month, day, or both (or none to view all data).
- Time stats: most common month, day of week, and start hour for trips.
- Station stats: top Start Station, End Station, and most frequent Start → End route.
- Trip stats: total and mean travel time (human-readable via `convert_seconds()`).
- User stats: counts by User Type and, when available, Gender and Birth Year summaries.
- Raw data viewer: opt-in batches of 5 rows using a CSV streaming helper.
- Extra utility: `get_most_common_season(city)` infers the busiest season from monthly usage.
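The helper features above could be sketched roughly as follows, using only the standard library. This is a minimal illustration, not the project's actual code: the output format of `convert_seconds()`, the names `iter_row_batches` and `most_common_season`, and the month-to-season mapping (meteorological seasons for the northern hemisphere) are all assumptions.

```python
import csv
import io
from collections import Counter
from itertools import islice


def convert_seconds(total_seconds):
    """Format a duration in seconds as 'Dd HH:MM:SS' (assumed format)."""
    total_seconds = int(round(total_seconds))
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)
    return f"{days}d {hours:02d}:{minutes:02d}:{seconds:02d}"


def iter_row_batches(csv_file, batch_size=5):
    """Yield lists of up to batch_size rows (dicts) from an open CSV file.

    Hypothetical helper: rows are read lazily, so the whole file is never
    held in memory at once.
    """
    reader = csv.DictReader(csv_file)
    while True:
        batch = list(islice(reader, batch_size))
        if not batch:
            return
        yield batch


# Assumed mapping: meteorological seasons, northern hemisphere.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def most_common_season(months):
    """Return the season with the most trips, given one month number per trip."""
    counts = Counter(SEASON_BY_MONTH[m] for m in months)
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    print(convert_seconds(90061))  # 1 day, 1 hour, 1 minute, 1 second
    sample = io.StringIO("id\n" + "\n".join(str(i) for i in range(7)))
    for batch in iter_row_batches(sample):
        print(len(batch), "rows")
    print(most_common_season([1, 2, 6, 6, 6, 3]))
```

In an interactive viewer, the generator would typically be advanced one batch at a time, prompting the user before each call to `next()`.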
Tech stack
Python, Pandas, NumPy, csv (stdlib), time module
Architecture (simplified)
- Ingest: load the selected city CSV into a DataFrame; parse `Start Time` to `datetime`.
- Feature derivation: add `month`, `day_of_week`, and `hour` features.
- Filtering: subset the DataFrame by chosen month and/or day.
- Computation:
  - Time stats via `mode()` on `month`, `day_of_week`, and `hour`.
  - Station stats including a composed `Start Station + " To " + End Station` route key.
  - Trip duration via `sum()` and `mean()`; humanize with `convert_seconds()`.
  - User stats via `value_counts()` with defensive checks for optional columns.
- Inspection (optional): stream raw rows in groups of 5 using `csv.DictReader`.
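The ingest, feature-derivation, filtering, and computation steps above can be sketched with pandas as shown here. This is an illustration under stated assumptions rather than the project's implementation: the function names `load_data` and `summarize`, the inline sample rows, and the `Trip Duration` and `Gender` column names are assumed for the example.

```python
import io

import pandas as pd

# Tiny made-up sample standing in for a city CSV (assumed column names).
SAMPLE_CSV = """Start Time,Trip Duration,Start Station,End Station,User Type
2017-01-02 08:15:00,600,A St,B St,Subscriber
2017-01-09 08:40:00,1200,A St,B St,Customer
2017-02-07 17:05:00,300,C St,A St,Subscriber
"""


def load_data(source, month=None, day=None):
    """Load a CSV, derive time features, and apply optional filters.

    month is a month number (1-12); day is a weekday name such as 'Monday'.
    """
    df = pd.read_csv(source)
    df["Start Time"] = pd.to_datetime(df["Start Time"])
    df["month"] = df["Start Time"].dt.month
    df["day_of_week"] = df["Start Time"].dt.day_name()
    df["hour"] = df["Start Time"].dt.hour
    if month is not None:
        df = df[df["month"] == month]
    if day is not None:
        df = df[df["day_of_week"] == day.title()]
    return df


def summarize(df):
    """Compute the headline statistics; assumes df has at least one row."""
    route = df["Start Station"] + " To " + df["End Station"]
    stats = {
        "month": df["month"].mode()[0],
        "day_of_week": df["day_of_week"].mode()[0],
        "hour": df["hour"].mode()[0],
        "route": route.mode()[0],
        "total_duration": df["Trip Duration"].sum(),
        "mean_duration": df["Trip Duration"].mean(),
        "user_types": df["User Type"].value_counts().to_dict(),
    }
    # Defensive check: not every city file is expected to carry this column.
    if "Gender" in df.columns:
        stats["gender"] = df["Gender"].value_counts().to_dict()
    return stats


if __name__ == "__main__":
    all_trips = load_data(io.StringIO(SAMPLE_CSV))
    print(summarize(all_trips))
    february = load_data(io.StringIO(SAMPLE_CSV), month=2)
    print(len(february), "trip(s) in February")
```

Note that `mode()[0]` raises an error on an empty selection, so a filter combination that matches no rows would need to be handled before calling `summarize`.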
Links
- Code repository: pdsnd_github (GitHub)
- Program: Udacity – Data Science Nanodegree
- Libraries: Pandas · NumPy · csv