I have been interested for a while now in selling goods through Fulfillment by Amazon (FBA). The idea is simple: buy cheap products from Alibaba, have them shipped to an Amazon warehouse, create a listing on Amazon for the product, and profit. There are many articles and blogs on the internet on how to do this successfully. There are also data mining software packages that let you see what is being sold on Amazon. I thought it would be interesting to see what was sold in the electronics department over the month ending on 7/16/2018. I gathered this data from Jungle Scout and, to my knowledge, this data has not been studied before. The objective is twofold:
1. Is it worth it to sell electronics on Amazon?
2. Can data science be used to determine what the best product is to be sold on Amazon?
One of the main goals of a data scientist is to look at a dataset and determine what value can be extracted from it. This is done through exploratory data analysis and by looking for correlations in the data.
import pandas as pd
import os
import glob
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
currdir = os.listdir()
All of the data sits in the same folder as our Jupyter Notebook, so we can load one of the files and look at its first few rows.
path = 'Jungle Scout CSV Export - Mon Jul 16 2018 18_08_38 GMT-0500 (Central Daylight Time).csv'
df = pd.read_csv(path, skiprows=2, index_col=False)
df.head()
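We have several CSV files that need to be combined into one dataframe, so we will loop over the files, read each one, and concatenate the results:
frames = []
for csv_file in glob.glob("*.csv"):
    # skiprows=2 skips the first two lines of each export, as in the single-file load above
    data = pd.read_csv(csv_file, skiprows=2, index_col=False)
    frames.append(data)
# Stack the per-file dataframes into a single dataframe
df = pd.concat(frames)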
df.describe()
The describe function summarizes every numeric column in the dataframe. The estimated monthly revenue gives a positive outlook on sales for the last month: the mean was \$13,895. Not bad for a side gig. However, the 50th percentile (the median) is only \$3,104 per month. A mean more than four times the median indicates a heavily right-skewed distribution, in which the top sellers get the vast majority of the sales. The average product has 8 sellers competing to sell it.
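To see why a gap between the mean and the median points to a few dominant sellers, here is a minimal sketch on made-up numbers. The revenue values below are hypothetical and are not taken from the Jungle Scout export; the same calls could be applied to the revenue column of `df`.

```python
import pandas as pd

# Hypothetical monthly revenue figures, invented for illustration only
# (not values from the Jungle Scout data).
revenue = pd.Series([500, 1200, 2500, 3000, 3200, 4000, 6000, 9000, 40000, 120000])

mean_revenue = revenue.mean()
median_revenue = revenue.median()

# Share of total revenue earned by the top 10% of listings (at least one listing).
top_n = max(1, int(len(revenue) * 0.1))
top_share = revenue.nlargest(top_n).sum() / revenue.sum()

print(f"mean:   {mean_revenue:,.0f}")
print(f"median: {median_revenue:,.0f}")
print(f"top 10% share of revenue: {top_share:.0%}")
```

In this toy example a single large listing pulls the mean far above the median, which is the same pattern the describe output shows for the real data.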
Next:
Cleaning the Data