SparkR gapply: how to apply linear regression across groups in a DataFrame?

Closed Posted 2 years ago Paid on delivery

Hi there,

I have a large dataset and want to fit a linear model to each group. Below is a small example of the pattern I want to parallelise: it computes a per-group maximum, and I would like to run a linear regression per group in the same way.

# Determine six waiting times with the largest eruption time in minutes.
# df is assumed to be a SparkDataFrame, e.g. df <- as.DataFrame(faithful).

schema <- structType(structField("waiting", "double"), structField("max_eruption", "double"))

result <- gapply(
    df,
    "waiting",
    function(key, x) {
        # key holds the group's waiting value; x is that group's rows as a local data.frame
        y <- data.frame(key, max(x$eruptions))
    },
    schema)

head(collect(arrange(result, "max_eruption", decreasing = TRUE)))
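One way the same gapply pattern could carry a per-group regression is sketched below. This is only a sketch under assumptions: it assumes an ordinary `lm()` fit per group is what is wanted, and the grouping column `grp`, the helper names `fit_group` and `run_on_spark`, and the output columns are all hypothetical, chosen for illustration on the built-in `faithful` data. The worker function is plain R, so it can be checked locally before it is handed to Spark.

```r
# Hypothetical example data: faithful plus an invented grouping column `grp`.
local_df <- faithful
local_df$grp <- ifelse(local_df$waiting > 70, "long", "short")

# Per-group worker (hypothetical name). gapply calls it once per group with
# key = list of grouping values and x = that group's rows as a local data.frame.
fit_group <- function(key, x) {
  fit <- lm(eruptions ~ waiting, data = x)
  coefs <- unname(coef(fit))
  data.frame(
    grp       = as.character(key[[1]]),
    intercept = coefs[1],
    slope     = coefs[2],              # NA if the group is too small to fit
    n         = as.numeric(nrow(x)),   # numeric so it matches a "double" field
    stringsAsFactors = FALSE
  )
}

# Local sanity check without Spark: split by group and apply the same worker.
local_check <- do.call(
  rbind,
  lapply(split(local_df, local_df$grp), function(x) fit_group(list(x$grp[1]), x))
)

# Spark version (hypothetical wrapper; requires SparkR and a working Spark install).
run_on_spark <- function(local_df) {
  SparkR::sparkR.session()
  sdf <- SparkR::as.DataFrame(local_df)
  # The schema must match the columns and types that fit_group returns.
  schema <- SparkR::structType(
    SparkR::structField("grp", "string"),
    SparkR::structField("intercept", "double"),
    SparkR::structField("slope", "double"),
    SparkR::structField("n", "double")
  )
  SparkR::collect(SparkR::gapply(sdf, "grp", fit_group, schema))
}
```

Calling `run_on_spark(local_df)` should return one row of coefficients per group. Each group is fitted on a single worker, so this approach presumably suits many small-to-medium groups rather than a few groups too large for one executor's memory.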

Data Mining R Programming Language

Project ID: #30580205
