SAS to R code: first. and last. variables
I am relatively new to SAS and am working on converting some code from SAS to R. I came across this snippet, which has me a little confused.
    data A;
        set B;
        by date id Units;
        retain Total;
        if first.id and last.id then do;
            Total = Units;
            output;
        end;
        else do;
            if first.id then Total = Units;
            else Total = sum(Total, Units);
            if last.id then output;
        end;
    run;
If my understanding of this code is right, this snippet outputs a data set called A that is a union (in SQL terminology) of all first and last occurrences of id and the last occurrence of the id. Am I right? What is the purpose of the BY statement then? I tried going through the SAS help, but I am still confused.
Thanks in advance!
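The BY statement is what creates the first.id and last.id variables that this code relies on; without it, you don't have access to those variables. Note that because date comes before id in the BY statement, first.id and last.id are also set whenever date changes, so each group is really one date/id combination. The input data must already be sorted by the BY variables.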
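To make the first./last. behaviour concrete, here is a minimal sketch in Python (used only for illustration; it is not SAS or R code). The function name `by_group_flags` and the sample rows are made up for this example, and the rows are assumed to be already sorted by the key columns.

```python
def by_group_flags(rows, keys):
    """Return (row, first, last) tuples that mimic SAS first.<var>/last.<var>
    for the last variable named in `keys`.

    Assumption: `rows` is already sorted by `keys`, as SAS requires for BY.
    """
    def key_of(row):
        return tuple(row[k] for k in keys)

    flagged = []
    for i, row in enumerate(rows):
        prev_key = key_of(rows[i - 1]) if i > 0 else None
        next_key = key_of(rows[i + 1]) if i + 1 < len(rows) else None
        first = key_of(row) != prev_key   # group starts on this row
        last = key_of(row) != next_key    # group ends on this row
        flagged.append((row, first, last))
    return flagged


# Hypothetical sample data standing in for data set B.
rows = [
    {"date": "2020-01-01", "id": 1, "Units": 5},
    {"date": "2020-01-01", "id": 1, "Units": 7},
    {"date": "2020-01-01", "id": 2, "Units": 3},
    {"date": "2020-01-02", "id": 2, "Units": 4},
]

flags = [(first, last) for _, first, last in by_group_flags(rows, ("date", "id"))]
print(flags)
```

The last two rows share id 2 but have different dates, so each one is both the first and the last row of its own group.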
What this code does is sum Units within each BY group, similar to:
    proc sql;
        select date, id, sum(units) as total
        from b
        group by date, id;
    quit;
Basically, if an ID has only a single row, Total=Units and that row is output straight away. Otherwise, on the first row for that ID, Total is set to Units; on each additional row, Units is added to Total (the RETAIN statement keeps Total from being reset between rows); and on the last row for that ID, a row is output with the total.
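The row-by-row logic described above can be sketched as follows. This is an illustrative Python translation, not the SAS semantics in full: the function name `data_step_totals` and the sample rows are hypothetical, the input is assumed to be sorted by date and id, and Units is assumed to have no missing values (SAS's `sum()` would skip them). The separate single-row branch in the SAS code is folded into the general case, since it produces the same result.

```python
def data_step_totals(rows):
    """Mimic the DATA step: keep a running Total per (date, id) group and
    emit one row, the last of each group, carrying that Total.

    Assumptions: rows are sorted by date and id; Units is never missing.
    """
    output = []
    total = None  # plays the role of RETAIN Total: survives across rows
    for i, row in enumerate(rows):
        key = (row["date"], row["id"])
        first_id = i == 0 or (rows[i - 1]["date"], rows[i - 1]["id"]) != key
        last_id = (i == len(rows) - 1
                   or (rows[i + 1]["date"], rows[i + 1]["id"]) != key)

        if first_id:
            total = row["Units"]          # start a new running total
        else:
            total = total + row["Units"]  # accumulate within the group

        if last_id:
            # Like OUTPUT: the row keeps its own Units plus the group Total.
            output.append({**row, "Total": total})
    return output


# Hypothetical sample data standing in for data set B.
rows = [
    {"date": "2020-01-01", "id": 1, "Units": 5},
    {"date": "2020-01-01", "id": 1, "Units": 7},
    {"date": "2020-01-01", "id": 2, "Units": 3},
    {"date": "2020-01-02", "id": 2, "Units": 4},
]

result = data_step_totals(rows)
for r in result:
    print(r)
```

Unlike the SQL query, the output rows keep every original column, with Units holding the value from the last row of each group.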