Deliverables: a report as a Word document and a Python script with comments explaining the code. The assignment brief is attached under Files.
Find two or more publicly accessible datasets on the web which are to be used to answer a
research question you are interested in. You may need to go through the below tasks (a)-(d)
multiple times in order to arrive at a meaningful research question and findings.
(a) Identify two (2) or more datasets based on your interest. Refer to Appendix 2 for possible
sources of datasets, but you are not limited to the sources listed. Analyse what is
present in the datasets, how the data is organised and which common features can be used
to merge the datasets. Are there data quality issues in the datasets (missing data, etc.)? Do you
need to clean and/or transform the raw data for analysis?
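For task (a), a minimal sketch of inspecting two datasets, counting missing values and merging on shared columns is given here. It assumes pandas is installed. The inline CSV text, the column names (country, year, population, gdp) and the helper names inspect and merge_datasets are all hypothetical stand-ins for whatever datasets are actually chosen.

```python
import io

import pandas as pd

# Hypothetical stand-ins for two downloaded CSV files (made-up values).
POPULATION_CSV = """country,year,population
A,2015,100
A,2016,110
B,2015,200
B,2016,
"""

GDP_CSV = """country,year,gdp
A,2015,1000
A,2016,1200
B,2015,3000
C,2015,500
"""


def inspect(df, name):
    """Print column types and return the count of missing values per column."""
    print(f"--- {name} ---")
    print(df.dtypes)
    missing = df.isna().sum()
    print(missing)
    return missing


def merge_datasets(left, right, keys):
    """Outer-merge on the shared key columns.

    indicator=True adds a _merge column that records whether each row
    was found in both datasets, only the left one, or only the right one.
    """
    return left.merge(right, on=keys, how="outer", indicator=True)


population = pd.read_csv(io.StringIO(POPULATION_CSV))
gdp = pd.read_csv(io.StringIO(GDP_CSV))

pop_missing = inspect(population, "population")
gdp_missing = inspect(gdp, "gdp")

merged = merge_datasets(population, gdp, ["country", "year"])

# Rows that exist in only one dataset point to coverage gaps.
unmatched = merged[merged["_merge"] != "both"]

# Keep fully matched rows with no missing measurements for the analysis.
clean = (
    merged[merged["_merge"] == "both"]
    .dropna(subset=["population", "gdp"])
    .drop(columns="_merge")
)
print(clean)
```

Using an outer merge first, rather than an inner merge, keeps the unmatched rows visible so that coverage gaps can be reported as a data quality issue before they are dropped.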
(b) Form a research question that interests you, after you have a good understanding of the
data. Refer to Appendix 1 for examples suitable for this assignment. Describe how you
will answer the stated research question. Clearly define the measures you are going to
calculate in order to answer your question. Refer to Example 3 in Appendix 1 as an
(c) Apply summary statistics and graphical plots (or other visualizations) to address the
stated research question by explaining your findings and observations. High marks are
awarded for in-depth analyses with real-world considerations and practical
ANL251 Copyright © 2018 Singapore University of Social Sciences (SUSS)
ECA – July Semester 2018
(d) Identify possible limitations of your findings. For example, are they limited to a certain
city or country? Are you making assumptions about the data which may, or may not,
Present your work for all the tasks (a)-(d) in your report using the provided template
(Appendix 3). Provide screenshots of the relevant Python code for each task and its output
where appropriate. Keep your report concise and coherent as a self-contained entity. The
evaluation criteria also include the logical flow of your explanation, the variety of
visualisations employed, the appropriateness of the summary statistics used, and novelty.
For a breakdown of the marks, please refer to Appendix 3.
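As an illustration of the summary statistics and visualisation mentioned above, the following is a rough sketch only. It assumes pandas is installed, and matplotlib as well if the plotting helper is called. The data values, the column names (region, income, spending) and the function names summarise, correlation and save_scatter are hypothetical.

```python
import pandas as pd

# Hypothetical merged data (made-up values).
data = pd.DataFrame(
    {
        "region": ["North", "North", "South", "South"],
        "income": [10.0, 20.0, 30.0, 40.0],
        "spending": [5.0, 10.0, 15.0, 20.0],
    }
)


def summarise(df, group_col, value_col):
    """Return the mean, median and standard deviation of value_col per group."""
    return df.groupby(group_col)[value_col].agg(["mean", "median", "std"])


def correlation(df, x, y):
    """Return the Pearson correlation between two numeric columns."""
    return df[x].corr(df[y])


def save_scatter(df, x, y, path):
    """Save a scatter plot of y against x to an image file."""
    # matplotlib is imported here so the statistics above work without it.
    import matplotlib

    matplotlib.use("Agg")  # non-interactive backend, writes files only
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(df[x], df[y])
    ax.set_xlabel(x)
    ax.set_ylabel(y)
    ax.set_title(f"{y} against {x}")
    fig.savefig(path)
    plt.close(fig)


stats = summarise(data, "region", "income")
r = correlation(data, "income", "spending")
print(stats)
print(f"Correlation between income and spending: {r:.2f}")
```

Calling save_scatter(data, "income", "spending", "scatter.png") would write the plot to a file that can be pasted into the report. The file name is only an example.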
Write the code you used for implementing the tasks (a)-(d) in a .py file. Use functions where
appropriate to organize your code well. The program should have sufficient comments to
describe the corresponding steps and logic for the various tasks.
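One possible layout for such a .py file is sketched here, with one function per step and a comment tying each function to a task. This is an assumption about how the script could be organised, not a required structure. It assumes pandas is installed. The function names, the column names and the gdp_per_capita measure are hypothetical.

```python
"""analysis.py: a hypothetical layout for the assignment script."""
import io

import pandas as pd


def load_data(source):
    """Task (a): read one CSV source (a file path or a file-like object)."""
    return pd.read_csv(source)


def clean_data(df, required):
    """Task (a): drop duplicate rows and rows missing any required column."""
    return df.dropna(subset=required).drop_duplicates()


def merge_data(left, right, keys):
    """Task (a): keep only the rows whose keys appear in both datasets."""
    return left.merge(right, on=keys, how="inner")


def add_measure(df):
    """Task (b): compute the measure used to answer the research question.

    The ratio here is only an example of a clearly defined measure.
    """
    out = df.copy()
    out["gdp_per_capita"] = out["gdp"] / out["population"]
    return out


def main(pop_source, gdp_source):
    """Run the steps in order and return the final table."""
    population = clean_data(load_data(pop_source), ["population"])
    gdp = clean_data(load_data(gdp_source), ["gdp"])
    merged = merge_data(population, gdp, ["country"])
    result = add_measure(merged)
    # Task (c): summary statistics of the measure.
    print(result["gdp_per_capita"].describe())
    return result


if __name__ == "__main__":
    # Made-up inline data so the script runs without downloading anything.
    main(
        io.StringIO("country,population\nA,100\nB,\n"),
        io.StringIO("country,gdp\nA,1000\nB,4000\n"),
    )
```

Keeping each step in its own function makes it easy to screenshot the code for one task at a time, and to rerun a single step when tasks (a)-(d) are repeated.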
Appendix 1 Proposing Research Questions
Example 1: What is the GDP of the U.S. for 2011?
Too Narrow. This is just asking for a fact or a single data point.
Example 2: What is the primary reason for global poverty?
Too Broad. What data will you use to answer this question? Would an answer be defendable
given the datasets?
Appendix 2 Sources to look for datasets
Singapore’s Public Data
Smart Nation Singapore
the United Nations
the World Bank
the Global Open Data Index
US Government Data
UK Government Data
Canada's Open Data Exchange
11 freelancers are bidding on average $162 for this job
Hello! I am a Python developer. I looked at your project and it seems interesting. I have all the skills required for this project. Ping me to discuss in detail.
Hey there, I have gone through your requirements and I can help you with this. I did a similar task in college a few years back, so I have an idea of how to go about it. Looking forward to hearing from you. Cheers