ABOUT THE DATA

About the data and techniques used: Text

The data used can be found here. The data I am using covers cases up to 1 October 2021.

Because the full dataset from the website is very large and filtering it with Python can cause errors, I manually deleted the columns that were not needed for this project. I then used Python to extract all rows related to streetlights and lampposts and saved them as "light.csv".
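The extraction step described above might look roughly like the sketch below. The column list comes from this page; the matching pattern, the chunk size, the extra "Agency" column, and the in-memory sample standing in for the full export are assumptions for illustration, not the exact script used.

```python
import io
import pandas as pd

# Columns kept for this project (taken from the dtypes listing on this page).
COLUMNS = [
    "Unique Key", "Created Date", "Closed Date", "Complaint Type",
    "Descriptor", "Status", "Resolution Action Updated Date", "Borough",
]

def extract_light_rows(source, chunksize=100_000):
    """Read a large CSV in chunks and keep only streetlight/lamppost rows."""
    # Assumed pattern: matches "street light", "streetlight", "lamppost", "lamp post".
    pattern = r"street ?light|lamp ?post"
    kept = []
    for chunk in pd.read_csv(source, usecols=COLUMNS, chunksize=chunksize):
        mask = chunk["Complaint Type"].str.contains(pattern, case=False, na=False)
        kept.append(chunk[mask])
    return pd.concat(kept, ignore_index=True)

# Hypothetical sample standing in for the full export (values are made up).
sample = """Unique Key,Created Date,Closed Date,Agency,Complaint Type,Descriptor,Status,Resolution Action Updated Date,Borough
1,09/30/2021 11:15:00 PM,10/01/2021 09:30:00 AM,DOT,Street Light Condition,Street Light Out,Closed,10/01/2021 09:30:00 AM,BROOKLYN
2,09/30/2021 10:00:00 PM,09/30/2021 11:00:00 PM,NYPD,Noise - Residential,Loud Music/Party,Closed,09/30/2021 11:00:00 PM,QUEENS
3,10/01/2021 08:00:00 AM,,DOT,Street Light Condition,Street Light Flickering,Assigned,10/01/2021 08:05:00 AM,BRONX
4,10/01/2021 09:00:00 AM,,DOT,Lamppost Damaged,Lamppost Leaning,Assigned,10/01/2021 09:10:00 AM,MANHATTAN
"""

light = extract_light_rows(io.StringIO(sample), chunksize=2)
print(light[["Unique Key", "Complaint Type"]])

# To save the result as in the text:
# light.to_csv("light.csv", index=False)
```

Reading with usecols and chunksize keeps memory use low, which may avoid the need to trim columns by hand.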

We can run the code below to review the columns and their data types:

import pandas as pd

df = pd.read_csv("light.csv")

print(df.dtypes)

The code will print out the following:

Unique Key                         int64
Created Date                      object
Closed Date                       object
Complaint Type                    object
Descriptor                        object
Status                            object
Resolution Action Updated Date    object
Borough                           object
dtype: object

THE DATASETS

UNIQUE KEY

This is an integer signifying the case's ID.

CREATED DATE

This column holds the date each case was reported, stored as strings. It will later be converted to datetime using the following code.

df['Created Date'] = pd.to_datetime(df['Created Date'])

CLOSED DATE

This column holds the date each case was closed, stored as strings. It will later be converted to datetime using the following code.

df['Closed Date'] = pd.to_datetime(df['Closed Date'])

COMPLAINT TYPE

This column holds the type of complaint as a string. For this project, we only look at cases whose Complaint Type is "Street Light Condition".
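As a minimal sketch of that filter, the snippet below keeps only the rows whose Complaint Type matches exactly. The small DataFrame is a made-up stand-in for "light.csv", and the variable name street_lights is an assumption for illustration.

```python
import pandas as pd

# Hypothetical rows standing in for the contents of "light.csv".
df = pd.DataFrame({
    "Unique Key": [1, 2, 3],
    "Complaint Type": [
        "Street Light Condition",
        "Noise - Residential",
        "Street Light Condition",
    ],
    "Descriptor": ["Street Light Out", "Loud Music/Party", "Street Light Flickering"],
})

# Boolean mask: True only where the complaint type matches exactly.
street_lights = df[df["Complaint Type"] == "Street Light Condition"].copy()
print(street_lights)
```

The .copy() call gives an independent DataFrame, so later column assignments do not trigger pandas' chained-assignment warning.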

DESCRIPTOR

This column holds a string describing the specific issue behind the complaint, such as flickering lights or damaged lightbulbs.

STATUS

This column holds the status of the case as a string. A case can be "Closed", "Opened", or "Assigned".

RESOLUTION ACTION UPDATED DATE

This column holds the date the resolution action for each case was last updated, stored as strings. It will later be converted to datetime using the following code.

df['Resolution Action Updated Date'] = pd.to_datetime(df['Resolution Action Updated Date'])
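The three date conversions on this page can also be done in one loop. The sketch below assumes the dates are written like "10/01/2021 09:30:00 AM"; if the file uses another layout, the format string should be changed or dropped. The sample rows are made up for illustration.

```python
import pandas as pd

# Hypothetical rows; a missing Closed Date stands for a case that is still open.
df = pd.DataFrame({
    "Created Date": ["09/30/2021 11:15:00 PM", "10/01/2021 08:00:00 AM"],
    "Closed Date": ["10/01/2021 09:30:00 AM", None],
    "Resolution Action Updated Date": ["10/01/2021 09:30:00 AM", "10/01/2021 08:05:00 AM"],
})

date_columns = ["Created Date", "Closed Date", "Resolution Action Updated Date"]

# Assumed layout: month/day/year with a 12-hour clock and AM/PM marker.
for column in date_columns:
    df[column] = pd.to_datetime(df[column], format="%m/%d/%Y %I:%M:%S %p")

print(df.dtypes)
```

Missing values become NaT, so open cases keep an empty Closed Date instead of raising an error.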

BOROUGH

This column holds a string naming the NYC borough the case is from.
