I have a dataset of around 430K GPS locations in a CSV (they come from my Google Location data, if that helps) that I would like to denoise before proceeding with my project. Some of the noise is, I guess, easier to remove, like in the image below, where the latitude varies by 3 degrees within a couple of seconds. Other cases are harder, caused by GPS noise when moving slowly or not at all. There are also stretches from flying in a plane, with points roughly 500 km apart because of few measurements at high speed, and car rides with no cellular connection to send the location to Google's servers.
I would like to clear up as much of the noise as possible. I searched for a solution and found Kalman filtering, but I have only seen it applied to much smaller datasets, and I guess that would not work on one this big. I also just can't wrap my head around Kalman filtering, so it is hard for me to design one myself. In addition, because the dataset is so large, I guess it would have to be done iteratively, e.g. grabbing a few values, applying the filter, then the next few, and so on. I also guess one could use the Time column to predict more precisely?
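For the obvious jumps I wonder if a plain speed check before any filtering would already help; something like the sketch below, where the column names ("Time", "Latitude", "Longitude"), the use of pandas and the 1000 km/h cut-off are just guesses for my CSV:

import numpy as np
import pandas as pd

def haversine_km(lat1, lon1, lat2, lon2):
    # great-circle distance between two points in kilometres
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * np.arcsin(np.sqrt(a))

df = pd.read_csv('locations.csv')          # assumes Time, Latitude, Longitude columns
df['Time'] = pd.to_datetime(df['Time'])

# implied speed between consecutive fixes
dt_h = df['Time'].diff().dt.total_seconds() / 3600.0
dist_km = haversine_km(df['Latitude'].shift(), df['Longitude'].shift(),
                       df['Latitude'], df['Longitude'])
speed_kmh = dist_km / dt_h

# keep the first row and everything below ~1000 km/h: plane rides survive,
# the "3 degrees in a couple of seconds" spikes do not
df_clean = df[(speed_kmh < 1000) | speed_kmh.isna()]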
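In case it clarifies what I mean by using the Time column: my rough understanding is that with a state like [lat, lat_velocity, lon, lon_velocity], the actual time gap between two fixes could go into the transition matrix instead of a fixed step of 1. This is just a sketch of the idea, and the state layout is my guess:

import numpy as np

# My (possibly wrong) understanding: the prediction step could use the real time
# gap dt (in seconds) between two fixes instead of assuming one fixed step per row.
def transition_matrix_for(dt):
    return np.array([[1, dt, 0,  0],
                     [0,  1, 0,  0],
                     [0,  0, 1, dt],
                     [0,  0, 0,  1]])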
EDIT:
Thanks to this question and answer, I've found this code, which I modified to read from my CSV:
from pykalman import KalmanFilter
import numpy as np
import matplotlib.pyplot as plt

# Read the CSV and build an (N, 2) array of [latitude, longitude] observations.
measurements = np.genfromtxt('locations.csv', delimiter=',', names=True)
finalArray = []
for i in measurements:
    measureArray = np.asarray([i["Latitude"], i["Longitude"]])
    finalArray.append(measureArray)
measurements = np.asarray(finalArray)

# Constant-velocity model: the state is [lat, lat_velocity, lon, lon_velocity].
initial_state_mean = [measurements[0, 0],
                      0,
                      measurements[0, 1],
                      0]

transition_matrix = [[1, 1, 0, 0],
                     [0, 1, 0, 0],
                     [0, 0, 1, 1],
                     [0, 0, 0, 1]]

# Only latitude and longitude are observed, not the velocities.
observation_matrix = [[1, 0, 0, 0],
                      [0, 0, 1, 0]]

kf1 = KalmanFilter(transition_matrices=transition_matrix,
                   observation_matrices=observation_matrix,
                   initial_state_mean=initial_state_mean)

# Learn the covariances from the data via EM, then smooth.
kf1 = kf1.em(measurements, n_iter=5)
(smoothed_state_means, smoothed_state_covariances) = kf1.smooth(measurements)

# Second pass with an inflated observation covariance for stronger smoothing.
kf2 = KalmanFilter(transition_matrices=transition_matrix,
                   observation_matrices=observation_matrix,
                   initial_state_mean=initial_state_mean,
                   observation_covariance=10 * kf1.observation_covariance,
                   em_vars=['transition_covariance', 'initial_state_covariance'])
kf2 = kf2.em(measurements, n_iter=5)
(smoothed_state_means, smoothed_state_covariances) = kf2.smooth(measurements)

# Plot raw latitude/longitude (dots) against the smoothed estimates (dashes).
plt.figure(2)
times = range(measurements.shape[0])
plt.plot(times, measurements[:, 0], 'bo',
         times, measurements[:, 1], 'ro',
         times, smoothed_state_means[:, 0], 'b--',
         times, smoothed_state_means[:, 2], 'r--')
plt.show()
This works pretty well for a smaller dataset, but it would take very long on a much larger one.
Any idea how to do this iteratively?
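For reference, this is roughly what I imagine "iteratively" could look like, reusing the arrays from the snippet above: fit the parameters with EM on a small subsample, then push the rows through pykalman's filter_update() one at a time, so the full 430K points never go through smooth() in a single pass. The subsample size is an arbitrary guess, and I'm not sure whether plain filtering instead of smoothing is acceptable:

# Rough sketch of the incremental idea (reuses measurements, transition_matrix,
# observation_matrix and initial_state_mean from above; the subsample size is a guess).
sample = measurements[:5000]

kf = KalmanFilter(transition_matrices=transition_matrix,
                  observation_matrices=observation_matrix,
                  initial_state_mean=initial_state_mean)
kf = kf.em(sample, n_iter=5)          # learn covariances on the subsample only

filtered_means = np.zeros((len(measurements), 4))
mean = np.asarray(kf.initial_state_mean)
cov = np.asarray(kf.initial_state_covariance)
for t, observation in enumerate(measurements):
    # one predict/update step per GPS fix, constant memory
    mean, cov = kf.filter_update(mean, cov, observation=observation)
    filtered_means[t] = mean

# filtered_means[:, 0] is then the filtered latitude, filtered_means[:, 2] the longitude.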