SQL Problem Solving UC Davis: Dataset

SQL Problem Solving Dataset

A few days ago, I decided to enroll in the Learn SQL Basics for Data Science Specialization from UC Davis, California, through Coursera to refresh and update my SQL knowledge. However, like many other students, I felt frustrated when I couldn’t find all the necessary materials to complete the practice exercises.

I’m talking about the SQL Problem Solving section, where we need a SQL dataset to practice with our DBMS preferred. Unfortunately, it seems someone forgot to upload the files correctly, and we only found an .XLSX file with books and information to import, but it wasn’t in the proper format.

The missing UC Davis SQL dataset issue

Many of us have tried reporting this problem directly through the class platform and forums, but no one has responded. I even sent an email through LinkedIn to the section professor without getting any reply 😩.

This situation left many students stuck and unable to progress through the course materials effectively.

The solution

Since I couldn’t find a solution, I took on the task of creating the SQL schema and import files for each table. I hope this helps everyone who might find this post online and is facing the same UC Davis SQL dataset problem.

The schema for creating tables

Note: The entire schema was generated using DBeaver DBMS

-- CourseraSecondPart.dates definition

CREATE TABLE `dates` (
  `ds` date DEFAULT NULL,
  `day_ahead_1` date DEFAULT NULL,
  `day_ago_1` date DEFAULT NULL,
  `day_ahead_7` date DEFAULT NULL,
  `day_ago_7` date DEFAULT NULL,
  `day_ahead_28` date DEFAULT NULL,
  `day_ago_28` date DEFAULT NULL
)

-- CourseraSecondPart.events definition

CREATE TABLE `events` (
  `id` varchar(255) DEFAULT NULL,
  `USER_ID` varchar(255) DEFAULT NULL,
  `SESSION_ID` varchar(255) DEFAULT NULL,
  `DEVICE_ID` varchar(255) DEFAULT NULL,
  `EVENT_NAME` varchar(255) DEFAULT NULL,
  `USER_PROPERTIES` text,
  `EVENT_PROPERTIES` text,
  `EVENT_TIME` timestamp NULL DEFAULT NULL,
  `CLIENT_EVENT_TIME` timestamp NULL DEFAULT NULL,
  `SERVER_UPLOAD_TIME` timestamp NULL DEFAULT CURRENT_TIMESTAMP
) 

-- CourseraSecondPart.item_field_history definition

CREATE TABLE `item_field_history` (
  `id` varchar(255) DEFAULT NULL,
  `record_id` varchar(255) DEFAULT NULL,
  `field_name` varchar(255) DEFAULT NULL,
  `previous_value` text,
  `new_value` text,
  `changed_timestamp` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
  `changed_by_user_id` varchar(255) DEFAULT NULL
) 

-- CourseraSecondPart.items definition

CREATE TABLE `items` (
  `id` varchar(255) NOT NULL,
  `category` varchar(255) DEFAULT NULL,
  `material` varchar(255) DEFAULT NULL,
  `display_name` varchar(255) DEFAULT NULL,
  `assembly_type` varchar(255) DEFAULT NULL,
  `availability` varchar(255) DEFAULT NULL,
  `list_price` float DEFAULT NULL,
  `is_test_item` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`)
) 

-- CourseraSecondPart.line_items definition

CREATE TABLE `line_items` (
  `id` varchar(255) DEFAULT NULL,
  `order_id` varchar(255) DEFAULT NULL,
  `item_id` varchar(255) DEFAULT NULL,
  `price` decimal(10,2) DEFAULT NULL
) 

-- CourseraSecondPart.orders definition

CREATE TABLE `orders` (
  `id` varchar(255) DEFAULT NULL,
  `user_id` varchar(255) DEFAULT NULL,
  `created_timestamp` timestamp NULL DEFAULT CURRENT_TIMESTAMP
) 

-- CourseraSecondPart.users definition

CREATE TABLE `users` (
  `id` varchar(255) NOT NULL,
  `email_address` varchar(255) DEFAULT NULL,
  `username` varchar(100) DEFAULT NULL,
  `locale` varchar(10) DEFAULT NULL,
  `created_timestamp` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
  `is_admin` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`)
) 

Download files to import the data

Implementation examples

Here are some practical examples you can try once you’ve imported the dataset:

-- GOAL: List all items within a category
SELECT  * FROM items
WHERE category = 'Nozzle'

-- GOAL: Count the number of order placed each day
SELECT 
	CAST(orders.created_timestamp AS DATE) AS order_created_date,
	COUNT(orders.id) AS number_of_orders
FROM orders
GROUP BY CAST(orders.created_timestamp AS DATE)

-- GOAL: List items ids, price and price category
SELECT items.id,
	   items.list_price,		
	   CASE WHEN items.list_price > 100 THEN '>$100'
	   		WHEN items.list_price > 25 THEN '>$25-$100'
	   		ELSE '<$25' END AS price_category
FROM items

If you found this dataset helpful, please leave a comment below. I’m also happy to help if you need modifications or require assistance with other parts of the course.

Feel free to reach out if you encounter any issues while importing the data or if you’d like additional practice materials for the specialization.

Remember that practice makes perfect when learning SQL, and having the right dataset is crucial for your learning journey.


Share your thoughts 😉

Leave a Reply

Your email address will not be published. Required fields are marked *