## Time series modeling to predict the end of the COVID-19 epidemic

About me: I am Octopus; the name comes from my Chinese name, which means octopus. I love programming, algorithms, and open source. All the source code is on my personal GitHub. This blog records the bits and pieces of my learning; if you are interested in Python, Java, AI, or algorithms, you can follow me so we can learn and make progress together.


Article directory

1. Prepare the data
2. Define the model
3. Train the model
4. Evaluate the model
5. Use the model
6. Save and use the model

The COVID-19 epidemic in China has been going on for more than three months since it was first detected. This disaster, which has been linked to the consumption of wild game, has affected everyone's life in many ways.

For some of us the impact has been financial, for some emotional, for some psychological, and for some it shows on the scale. So when will the epidemic in China end? When will we be free again?

This article uses TensorFlow 2.0 to build a time series RNN model to predict when the COVID-19 epidemic in China will end.

### 1. Prepare the data

The dataset in this article comes from tushare and is stored in the data directory of this project.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras import models, layers, losses, metrics, callbacks
```

```python
%matplotlib inline
%config InlineBackend.figure_format = 'svg'

df = pd.read_csv("./data/covid-19.csv", sep="\t")
df.plot(x="date", y=["confirmed_num", "cured_num", "dead_num"], figsize=(10, 6))
plt.xticks(rotation=60)
```

```python
dfdata = df.set_index("date")
dfdiff = dfdata.diff(periods=1).dropna()
dfdiff = dfdiff.reset_index("date")

dfdiff.plot(x="date", y=["confirmed_num", "cured_num", "dead_num"], figsize=(10, 6))
plt.xticks(rotation=60)
dfdiff = dfdiff.drop("date", axis=1).astype("float32")
```

```python
# Use the 8-day window of data before a given day as input to predict that day's data
WINDOW_SIZE = 8

def batch_dataset(dataset):
    dataset_batched = dataset.batch(WINDOW_SIZE, drop_remainder=True)
    return dataset_batched

ds_data = tf.data.Dataset.from_tensor_slices(tf.constant(dfdiff.values, dtype=tf.float32)) \
    .window(WINDOW_SIZE, shift=1).flat_map(batch_dataset)

ds_label = tf.data.Dataset.from_tensor_slices(
    tf.constant(dfdiff.values[WINDOW_SIZE:], dtype=tf.float32))

# The dataset is small, so all training data can go into one batch to improve performance
ds_train = tf.data.Dataset.zip((ds_data, ds_label)).batch(38).cache()
```
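To see what the windowing above produces, here is a minimal plain-NumPy sketch of the same sliding-window pairing (the array values are illustrative, not the project's data; `tf.data.Dataset.zip` truncates to the shorter of the two streams, which is what the list comprehension below reproduces):

```python
import numpy as np

# Illustrative daily series: 12 days, 3 columns (confirmed, cured, dead)
series = np.arange(36, dtype=np.float32).reshape(12, 3)
WINDOW_SIZE = 8

# Each sample is an 8-day window; its label is the day right after the window
windows = np.stack([series[i:i + WINDOW_SIZE] for i in range(len(series) - WINDOW_SIZE)])
labels = series[WINDOW_SIZE:]

print(windows.shape)  # (4, 8, 3): 12 - 8 = 4 (window, label) pairs
print(labels.shape)   # (4, 3)
```

Each label row is the day immediately following its window, which is exactly the supervision signal the LSTM model is trained on.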

### 2. Define the model

There are three ways to build a model with the Keras interface: using Sequential to stack layers in order, using the functional API to build models with arbitrary structure, and subclassing the Model base class to build a fully custom model.

Here we choose the functional API, since it supports arbitrary model structures.

```python
# Since new confirmed, cured, and death counts cannot be negative,
# the following structure is designed
class Block(layers.Layer):
    def __init__(self, **kwargs):
        super(Block, self).__init__(**kwargs)

    def call(self, x_input, x):
        x_out = tf.maximum((1 + x) * x_input[:, -1, :], 0.0)
        return x_out

    def get_config(self):
        config = super(Block, self).get_config()
        return config
```
```python
tf.keras.backend.clear_session()
x_input = layers.Input(shape=(None, 3), dtype=tf.float32)
x = layers.LSTM(3, return_sequences=True, input_shape=(None, 3))(x_input)
x = layers.LSTM(3, return_sequences=True, input_shape=(None, 3))(x)
x = layers.LSTM(3, return_sequences=True, input_shape=(None, 3))(x)
x = layers.LSTM(3, input_shape=(None, 3))(x)
x = layers.Dense(3)(x)

# Since new confirmed, cured, and death counts cannot be negative,
# apply the custom Block layer
# x = tf.maximum((1 + x) * x_input[:, -1, :], 0.0)
x = Block()(x_input, x)
model = models.Model(inputs=[x_input], outputs=[x])
model.summary()
```
```
Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, None, 3)]         0
_________________________________________________________________
lstm (LSTM)                  (None, None, 3)           84
_________________________________________________________________
lstm_1 (LSTM)                (None, None, 3)           84
_________________________________________________________________
lstm_2 (LSTM)                (None, None, 3)           84
_________________________________________________________________
lstm_3 (LSTM)                (None, 3)                 84
_________________________________________________________________
dense (Dense)                (None, 3)                 12
_________________________________________________________________
block (Block)                (None, 3)                 0
=================================================================
Total params: 348
Trainable params: 348
Non-trainable params: 0
_________________________________________________________________
```
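The 84 parameters reported for each LSTM layer can be checked by hand: an LSTM has four gates, each with an input weight matrix, a recurrent weight matrix, and a bias vector. A quick arithmetic sanity check:

```python
def lstm_param_count(input_dim, units):
    # 4 gates, each with: input weights (input_dim * units),
    # recurrent weights (units * units), and biases (units)
    return 4 * (input_dim * units + units * units + units)

print(lstm_param_count(3, 3))  # 84, matching each LSTM layer in the summary
print(3 * 3 + 3)               # 12, the Dense(3) layer on a 3-dim input
```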

### 3. Train the model

There are usually three ways to train a model: the built-in fit method, the built-in train_on_batch method, and a custom training loop. Here we choose the most common and simplest one, the built-in fit method.

Note: recurrent neural networks are difficult to tune; it is usually necessary to try several different learning rates to get good results.

```python
# Custom loss function: squared error taken relative to the prediction target
class MSPE(losses.Loss):
    def call(self, y_true, y_pred):
        err_percent = (y_true - y_pred) ** 2 / (tf.maximum(y_true ** 2, 1e-7))
        mean_err_percent = tf.reduce_mean(err_percent)
        return mean_err_percent

    def get_config(self):
        config = super(MSPE, self).get_config()
        return config
```
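As a quick sanity check of the loss definition above, the same formula can be evaluated in plain NumPy (the values below are toy numbers, not real epidemic data):

```python
import numpy as np

def mspe(y_true, y_pred, eps=1e-7):
    # Squared error divided by the squared target (floored at eps), then averaged
    err_percent = (y_true - y_pred) ** 2 / np.maximum(y_true ** 2, eps)
    return float(np.mean(err_percent))

y_true = np.array([100.0, 50.0], dtype=np.float32)
y_pred = np.array([110.0, 45.0], dtype=np.float32)
# Both elements are off by 10% of the target, so each contributes 0.01
print(mspe(y_true, y_pred))  # ~0.01
```

Because the error is relative to the target, the loss treats a miss of 10 on a count of 100 the same as a miss of 5 on a count of 50, which suits epidemic counts that span very different magnitudes.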
```python
import os
import datetime

optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)
model.compile(optimizer=optimizer, loss=MSPE(name="MSPE"))

stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
logdir = os.path.join('data', 'autograph', stamp)

# Under Python 3, pathlib is recommended for cross-platform paths
# from pathlib import Path
# stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
# logdir = str(Path('./data/autograph/' + stamp))

tb_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)
# If the loss does not improve for 100 epochs, halve the learning rate.
lr_callback = tf.keras.callbacks.ReduceLROnPlateau(monitor="loss", factor=0.5, patience=100)
# If the loss does not improve for 200 epochs, stop training early.
stop_callback = tf.keras.callbacks.EarlyStopping(monitor="loss", patience=200)
callbacks_list = [tb_callback, lr_callback, stop_callback]

history = model.fit(ds_train, epochs=500, callbacks=callbacks_list)
```

### 4. Evaluate the model

Evaluating a model generally requires a validation or test set. Because the amount of data in this case is small, we only visualize how the loss function evolves on the training set.

```python
%matplotlib inline
%config InlineBackend.figure_format = 'svg'

import matplotlib.pyplot as plt

def plot_metric(history, metric):
    train_metrics = history.history[metric]
    epochs = range(1, len(train_metrics) + 1)
    plt.plot(epochs, train_metrics, 'bo--')
    plt.title('Training ' + metric)
    plt.xlabel("Epochs")
    plt.ylabel(metric)
    plt.legend(["train_" + metric])
    plt.show()
```
```python
plot_metric(history, "loss")
```

### 5. Use the model

Here we use the model to predict the end of the epidemic, that is, the time when the number of new confirmed cases drops to 0.

```python
# Use dfresult to record both the existing data and the predicted future epidemic data
dfresult = dfdiff[["confirmed_num", "cured_num", "dead_num"]].copy()
dfresult.tail()
```
```python
# Predict the trend for the next 200 days and append the results to dfresult
for i in range(200):
    arr_predict = model.predict(tf.constant(tf.expand_dims(dfresult.values[-38:, :], axis=0)))

    dfpredict = pd.DataFrame(tf.cast(tf.floor(arr_predict), tf.float32).numpy(),
                             columns=dfresult.columns)
    dfresult = dfresult.append(dfpredict, ignore_index=True)
```
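Note that `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0; on recent pandas versions the same row accumulation can be written with `pd.concat`. A sketch with made-up rows (the column names match the project, the values do not):

```python
import pandas as pd

# Made-up existing data, one row of daily increments
dfresult = pd.DataFrame({"confirmed_num": [10.0], "cured_num": [2.0], "dead_num": [1.0]})

# Equivalent of dfresult.append(dfpredict, ignore_index=True) on modern pandas
dfpredict = pd.DataFrame([[8.0, 3.0, 0.0]], columns=dfresult.columns)
dfresult = pd.concat([dfresult, dfpredict], ignore_index=True)
print(len(dfresult))  # 2
```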
```python
dfresult.query("confirmed_num==0").head()

# New confirmed cases drop to 0 on day 55. Day 45 corresponds to March 10,
# so 10 days later, i.e. around March 20, new confirmed cases are expected to reach 0.
# Note: this forecast is optimistic.
```
```python
dfresult.query("cured_num==0").head()

# New cured cases drop to 0 starting on day 164. Day 45 corresponds to March 10,
# so that is about 4 months later, i.e. around July 10.
# Note: this forecast is pessimistic and clearly flawed: summing the predicted
# daily cured counts would exceed the cumulative number of confirmed cases.
```
```python
dfresult.query("dead_num==0").head()

# New deaths drop to 0 starting on day 60. Day 45 corresponds to March 10,
# so about 15 days later, i.e. around 2020-03-25.
# This prediction looks reasonable.
```
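Since dfresult holds daily increments (first differences), cumulative totals can be recovered by adding a cumulative sum of the increments to the day-0 baseline; this is also how the over-count in the cured_num forecast can be verified. A minimal sketch with made-up increments:

```python
import pandas as pd

# Made-up daily new confirmed counts (increments), not the project's real data
daily_new = pd.Series([10.0, 20.0, 5.0, 0.0])
baseline = 100.0  # cumulative total on the day before the series starts

# Cumulative total on each day = baseline + running sum of increments
cumulative = baseline + daily_new.cumsum()
print(cumulative.tolist())  # [110.0, 130.0, 135.0, 135.0]
```

If the running sum of predicted cured cases ever exceeds the cumulative confirmed total computed this way, the forecast has become internally inconsistent, which is exactly the problem noted for the cured_num prediction.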

### 6. Save and use the model

```python
model.save('./data/tf_model_savedmodel', save_format="tf")
print('export saved model.')
```
```python
model_loaded = tf.keras.models.load_model('./data/tf_model_savedmodel', compile=False)
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
model_loaded.compile(optimizer=optimizer, loss=MSPE(name="MSPE"))
model_loaded.predict(ds_train)
```