Straightforward R code for contingency analysis
$50-150 CAD
Paid on delivery
The project involves contingency analysis of data following the binomial distribution. Imagine a system composed of n points which we observed over a certain time (observation times are in years, and they differ among locations). At each point we can observe no events, one event or several events during the observation period, which is specific to that point. During one year, a point may have either no event or exactly one event. It follows that during one year the system may exhibit anything between 0 and n events, where n is the number of points which are active (recording) in that year.
We need to know the probability of observing k simultaneous events across our system during a year, given
1 - number of sites covering a particular period
2 - total number of years with such events during that period, and
3 - length of that period, in years
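A minimal sketch of this calculation in R, for illustration only. It assumes that the per-site, per-year event probability p is estimated as the number of site-years with events divided by (sites × years); the posting does not state this estimator, and the function and argument names are hypothetical.

```r
# Hypothetical helper: probability that exactly k sites show an event in one year.
# Assumption (not stated in the posting): p is estimated as
# (site-years with events) / (number of sites * number of years).
prob_k_events <- function(k, n_sites, event_site_years, n_years) {
  p <- event_site_years / (n_sites * n_years)  # per-site, per-year event probability
  dbinom(k, size = n_sites, prob = p)          # binomial probability of k events
}

# Made-up example: 10 sites, 50 years, 40 site-years with events (p = 0.08)
probs <- prob_k_events(0:10, n_sites = 10, event_site_years = 40, n_years = 50)
print(round(probs[1:4], 4))
```

Passing a vector for k returns the whole distribution at once, which is convenient when the probabilities for 0, 1, ... n events are all needed.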
The code should evaluate the theoretically expected frequencies of years in which events are recorded at 0, 1, ... n points, where n is the maximum number of events observed at different points during one year. In other words, we need to calculate the joint probabilities of event occurrence. To calculate the expected frequencies we assume a binomial distribution of the events:
SEE FORMULA IN ATTACHED FILE
where N is the total number of recording points in the analysis of a specific period, X is the number of events in a single year, p is the probability of a site exhibiting an event in any year, and q = 1 - p is the complementary probability (no event). The differences between expected and observed frequencies are to be assessed with the Chi-square test.
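As a rough illustration of the expected-frequency and Chi-square step, here is a minimal R sketch. It assumes the attached formula is the standard binomial probability built from N, X, p and q as defined above, and that N is constant within the analysed period. The function name and the example counts are made up.

```r
# Hypothetical helper: expected number of years with 0..N events and a
# Chi-square statistic against the observed counts.
# obs[x + 1] = observed number of years with x events (x = 0..N).
expected_vs_observed <- function(obs, p) {
  N <- length(obs) - 1                      # number of recording points
  n_years <- sum(obs)                       # number of years in the period
  x <- 0:N
  q <- 1 - p                                # probability of no event at a site
  prob <- choose(N, x) * p^x * q^(N - x)    # assumed binomial formula
  expected <- n_years * prob                # expected frequencies of years
  chi_sq <- sum((obs - expected)^2 / expected)
  list(expected = expected, chi_sq = chi_sq)
}

obs <- c(20, 18, 8, 3, 1)                   # made-up counts: N = 4 points, 50 years
p_hat <- sum(0:4 * obs) / (4 * sum(obs))    # events per site-year, estimated from obs
res <- expected_vs_observed(obs, p_hat)
print(res)
```

Because p is estimated from the same data here, a p-value for the statistic would need one extra degree of freedom removed, and classes with very small expected counts are commonly pooled before testing.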
Finally, the results should be bootstrapped. The bootstrapping part should be arranged as follows:
1. Bootstrapping operates on a moving time frame of a length specified by the user.
2. The frame moves from the start of the specified period to its end, shifting by one year at a time.
3. At each frame position, the code calculates the required statistics.
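The moving-frame steps above could be sketched in R roughly as follows. The input is assumed to be a vector holding the number of events in each year of the analysed period. The function name, its arguments and the example data are hypothetical.

```r
# Hypothetical helper: apply a statistic at every position of a moving time frame.
# events_per_year: one entry per year, the number of events recorded that year.
moving_frame <- function(events_per_year, frame_len, stat_fun) {
  n <- length(events_per_year)
  stopifnot(frame_len >= 1, frame_len <= n)
  starts <- seq_len(n - frame_len + 1)      # frame shifts by one year at a time
  lapply(starts, function(s) {
    stat_fun(events_per_year[s:(s + frame_len - 1)])
  })
}

events <- c(0, 1, 0, 2, 1, 0, 0, 3, 1, 0)   # made-up yearly event counts
# Observed frequencies of years with 0..3 events at each frame position
count_years <- function(w) table(factor(w, levels = 0:3))
frames <- moving_frame(events, frame_len = 4, stat_fun = count_years)
print(frames[[1]])
```

Passing the statistic in as a function keeps the frame logic separate from what is computed at each position, so observed frequencies, expected frequencies and the Chi-square statistic can all reuse the same loop.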
The code will be fed with data files containing (a) the record of events at each point and (b) the time period covered by each point (i.e. the period when the point was "recording" events, so to say).
As user input, the code should take the length of the period to analyse (the data in the original input file will cover a larger period than the one we are interested in) and the length of the time frame for bootstrapping.
As output, I would need observed and expected frequencies of years with 0, 1 ... n events, and respective Chi-square statistics, for each position of the time frame. All these variables should come with bootstrap-generated 90% and 95% confidence envelopes.
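One possible way to obtain such envelopes is a percentile bootstrap within each frame, sketched below in R. The posting does not specify the resampling scheme, so resampling the years of a frame with replacement is an assumption here, as are the function name, the number of replicates and the example data.

```r
# Hypothetical helper: percentile bootstrap envelopes for one frame position.
# Assumption: the years inside the frame are resampled with replacement.
boot_envelope <- function(window, stat_fun, n_boot = 1000) {
  reps <- replicate(n_boot, {
    idx <- sample.int(length(window), replace = TRUE)  # resampled year indices
    stat_fun(window[idx])
  })
  c(ci90 = quantile(reps, c(0.05, 0.95)),     # 90% envelope
    ci95 = quantile(reps, c(0.025, 0.975)))   # 95% envelope
}

set.seed(1)                                   # for reproducibility of the sketch
window <- c(0, 1, 0, 2, 1, 0, 0, 3, 1, 0)     # made-up yearly event counts
# Statistic: proportion of years with no events
env <- boot_envelope(window, stat_fun = function(w) mean(w == 0))
print(env)
```

The statistic passed in must return a single number. For a vector of frequencies, the helper would be applied once per class (0, 1, ... n events).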
I need the code to be written in R and well commented. Please, don't bid on this project if you cannot write the code in R.
Excel file with in-cell formulas to calculate expected frequencies will be provided. This file will also contain examples.
Project ID: #5785533
About the project
5 freelancers are bidding on average $137 for this job
The project looks simpler than the ones I have worked on until now. It will take 1-2 days, but I have kept a margin of 2 more days because of the weekend.
Hi brother, I am a data scientist working in a multinational company. My work is to find hidden patterns in large and complex data sets, and covers predictive analytics, data mining, machine learning and also uses the stati...
Hi, I am Nikhil from India. I can do this project quite easily. You will get 100% accuracy and satisfaction. Thanks, Nikhil Gupta