Friday, September 5, 2014

GPS Coordinate Analysis using R-Programming - Analytics


R-Programming GPS Analytics


I had a big dataset of GPS coordinates and was asked to analyse it. I tried it in Java first, and that worked fairly well, but I wanted something different. I knew the right tool for the job, so I went for R, which I always love :-).


R already has plenty of libraries for this kind of work, so after a little research I settled on three:

Hmisc is one of my favourites because it already implements a large number of data analysis and mathematical functions, which can be used for data cleansing, segregation, string manipulation and a lot more than I can cover here.

I had a huge dataset and wanted to extract only the rows I needed. Since I like writing queries, I used the sqldf package, which lets you run SQL against a data frame.

And finally, ggmap is an excellent library that allows easy visualization of spatial data and models on top of Google Maps, OpenStreetMap, Stamen Maps, or CloudMade Maps using ggplot2.


First I read the data from a CSV file and then pulled the longitude and latitude columns out of it.

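 library(sqldf)  # SQL queries on data frames
 library(sp)     # spDistsN1
 library(ggmap)  # get_map, ggmap
 irdata = read.csv("D:/data_csv.csv", header = TRUE)
 data = sqldf("select Longitude,Latitude from irdata where Latitude != 'N/A' and City = 'Mumbai' ")
 # as.character() first, so this works whether the columns came in as factor or character
 data_long = as.numeric(as.character(data$Longitude))
 data_lat = as.numeric(as.character(data$Latitude))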
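Converting a column that was read as a factor needs a little care, because calling as.integer() or as.numeric() directly on a factor returns the internal level codes, not the coordinates. Here is a minimal sketch with made-up values (the `lon` vector is hypothetical, not taken from my dataset):

```r
# Hypothetical longitude column read as a factor
# (read.csv did this by default before R 4.0)
lon <- factor(c("72.8777", "72.8311", "72.8777", "72.9000"))

# Level codes: positions in the sorted level list, not coordinates
codes <- as.integer(lon)

# Two ways to get the real values back
via_levels    <- as.numeric(levels(lon))[lon]
via_character <- as.numeric(as.character(lon))

print(codes)
print(via_levels)
print(via_character)
```

Both conversions give the same numbers. The as.character() route also works when the column is already a plain character vector.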


Since I have more than 5000 points, you can imagine that plotting all of them would make the map look like a spider web, and I couldn't analyse anything. So I extracted just the small slice of data I needed:



 someCoords1 <- data.frame(long=data_long[100:200], lat=data_lat[100:200])  
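Note that 100:200 picks 101 consecutive rows. If the rows are ordered in some way (by time or by area, say), a random sample of the same size may give a fairer picture. Here is a rough sketch of both options, using randomly generated stand-in coordinates (the values and the names `slice_coords`, `sample_coords` are made up for illustration):

```r
# Hypothetical stand-in for the full coordinate vectors
set.seed(42)
data_long <- runif(5000, min = 72.77, max = 72.99)
data_lat  <- runif(5000, min = 18.89, max = 19.27)

# Contiguous slice: rows 100 to 200 inclusive
slice_coords <- data.frame(long = data_long[100:200], lat = data_lat[100:200])

# Alternative: a random sample of the same size, without replacement
idx <- sample(length(data_long), size = 101)
sample_coords <- data.frame(long = data_long[idx], lat = data_lat[idx])

print(nrow(slice_coords))
print(nrow(sample_coords))
```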

OK, now I have some data, so why not find the distance between each pair of these coordinates? A huge dataset is not needed for the comparison, so the output below is for just 6 points, with the distances in km. The spDistsN1 function comes from the sp package.



 apply(someCoords1, 1, function(eachPoint) spDistsN1(as.matrix(someCoords1), eachPoint, longlat=TRUE))  

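With longlat = TRUE the result is the great-circle distance in kilometres. With longlat = FALSE you get plain Euclidean distance in the units of the coordinates (degrees here), so keep it TRUE for longitude/latitude data. The output in km: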

           [,1]       [,2]      [,3]     [,4]       [,5]      [,6]
 [1,]  0.000000 16.6289935  9.937742 44.73177 15.7613710 17.661536
 [2,] 16.628993  0.0000000 14.142999 29.38614  0.8795789  3.794917
 [3,]  9.937742 14.1429990  0.000000 37.64567 13.3679239 17.083060
 [4,] 44.731771 29.3861396 37.645667  0.00000 30.0890677 30.811816
 [5,] 15.761371  0.8795789 13.367924 30.08907  0.0000000  4.162316
 [6,] 17.661536  3.7949174 17.083060 30.81182  4.1623161  0.000000
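To get a feel for what a great-circle distance matrix like this contains, here is a small base R sketch using the haversine formula. This is not what sp does internally (as I understand it, sp works on the WGS84 ellipsoid, so its numbers will differ slightly). It is a simpler spherical approximation with an assumed Earth radius of 6371 km, and the points and the `haversine_km` helper are made up for illustration:

```r
# Haversine great-circle distance in km on a sphere (assumed radius 6371 km)
haversine_km <- function(lon1, lat1, lon2, lat2, radius = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Hypothetical points, not taken from my dataset
pts <- data.frame(long = c(72.8777, 72.8311, 72.9781),
                  lat  = c(19.0760, 19.1136, 19.2183))

# Pairwise distance matrix: entry [i, j] is the distance from point i to point j
n <- nrow(pts)
dist_km <- outer(seq_len(n), seq_len(n), function(i, j) {
  haversine_km(pts$long[i], pts$lat[i], pts$long[j], pts$lat[j])
})

print(round(dist_km, 2))
```

Like the spDistsN1 output, the matrix is symmetric with zeros on the diagonal, because the distance from a point to itself is zero and the distance from i to j equals the distance from j to i.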

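Now simply use get_map and pass the centre coordinates. Note that recent versions of ggmap need a Google API key (registered with register_google) when source = "google".
 # mapdata is not defined above; assumed here to hold the subset as columns data_long / data_lat
 mapdata <- data.frame(data_long = data_long[100:200], data_lat = data_lat[100:200])
 mapgilbert <- get_map(location = c(lon = mean(mapdata$data_long), lat = mean(mapdata$data_lat)), zoom = 14,
            maptype = "roadmap", source = "google")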

Now it's time to draw, so I used geom_point on top of the map.



 ggmap(mapgilbert) +
  geom_point(data = mapdata, aes(x = data_long, y = data_lat, fill = "red", alpha = 0.8), size = 3, shape = 21) +
  expand_limits(x = data_long, y = data_lat) + guides(fill=FALSE, alpha=FALSE, size=FALSE)


And here is the output:


Isn't it amazing? Well, I found it great :-)

I tried a few different map types for the same data as well.

Now that the points are drawn, why not draw a path between them? geom_path connects the points in the order they appear in the data:



 ggmap(mapgilbert) +   
  geom_path(aes(x = data_long, y = data_lat), data=mapdata ,alpha=0.2, size=1,color="yellow",lineend='round')+  
  geom_point(data = mapdata, aes(x = data_long, y = data_lat, fill = "red", alpha = 0.8), size = 3, shape = 21)  

And these are the outputs for each maptype:











