CAN Dataset for intrusion detection


Controller Area Network (CAN) is a bus communication protocol that defines a standard for reliable and efficient real-time transmission between in-vehicle nodes. Because a CAN message is broadcast from a transmitter to all other nodes on the bus, it carries no source or destination address that could be used for validation. An attacker can therefore easily inject arbitrary messages and cause system malfunctions. In this paper, we propose an intrusion detection method based on analysis of the offset ratio and the time interval between request and response messages in CAN. When a remote frame with a particular identifier is transmitted, the receiver node should respond to it immediately. Each node therefore has a fixed response offset ratio and time interval in the normal state, while these values vary in an attack state. Using this property, we can measure the response performance of the existing nodes from the offset ratio and time interval between request and response messages. As a result, our methodology detects intrusions by monitoring the offset ratio and time interval, and it allows quick intrusion detection with high accuracy.
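The time-interval part of this idea can be illustrated with a minimal sketch. It is not the paper's implementation: the frame layout (timestamp, CAN ID, remote-frame marker), the function names, and the fixed-tolerance check are all assumptions made for illustration, and the offset ratio is left out because it is not defined on this page.

```python
from typing import Iterable, List, Tuple

# Hypothetical frame layout: (timestamp in seconds, CAN ID, is_remote_frame).
Frame = Tuple[float, int, bool]


def response_intervals(frames: Iterable[Frame], target_id: int) -> List[float]:
    """Delay between each remote frame for target_id and the first
    data frame with the same ID that follows it."""
    intervals: List[float] = []
    pending = None  # timestamp of the most recent unanswered remote frame
    for ts, can_id, is_remote in frames:
        if can_id != target_id:
            continue
        if is_remote:
            pending = ts
        elif pending is not None:
            intervals.append(ts - pending)
            pending = None
    return intervals


def is_anomalous(interval: float, baseline_mean: float, tolerance: float) -> bool:
    """Flag an interval that deviates from the normal-state baseline.
    The fixed tolerance is an illustrative choice, not the paper's rule."""
    return abs(interval - baseline_mean) > tolerance
```

In use, the baseline would be estimated from attack-free traffic, and intervals observed later would be compared against it.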

1. Dataset

We provide datasets that cover DoS attack, fuzzy attack, impersonation attack, and attack-free states. The datasets were constructed by logging CAN traffic via the OBD-II port of a real vehicle while message injection attacks were being performed.
We extracted the in-vehicle data from a KIA SOUL.

    1.    DoS Attack : Injecting messages with CAN ID '0x000' in a short cycle.
    2.    Fuzzy Attack : Injecting messages with spoofed random CAN ID and DATA values.
    3.    Impersonation Attack : Injecting messages from an impersonating node, arbitration ID = '0x164'.
    4.    Attack Free State : Normal CAN messages.
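Because the DoS attack injects a single ID in a short cycle, its effect should be visible in per-ID message timing. The sketch below computes the mean inter-arrival time for each CAN ID; the function name and the (timestamp, CAN ID) input layout are hypothetical, and this is only an exploratory aid, not the detection method described above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def mean_interarrival(messages: Iterable[Tuple[float, int]]) -> Dict[int, float]:
    """Mean gap in seconds between consecutive messages of each CAN ID.
    IDs seen only once have no gap and are omitted from the result."""
    last_seen: Dict[int, float] = {}
    gaps: Dict[int, List[float]] = defaultdict(list)
    for ts, can_id in messages:
        if can_id in last_seen:
            gaps[can_id].append(ts - last_seen[can_id])
        last_seen[can_id] = ts
    return {can_id: sum(g) / len(g) for can_id, g in gaps.items()}
```

An ID whose mean gap is far shorter than that of the other IDs, such as '0x000' during the DoS capture, would be a natural candidate for closer inspection.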

1.1 Data attributes

Timestamp, CAN ID, DLC, DATA[0], DATA[1], DATA[2], DATA[3], DATA[4], DATA[5], DATA[6], DATA[7], flag

    1.    Timestamp : recorded time (s)
    2.    CAN ID : identifier of the CAN message in HEX (e.g., 043f)
    3.    DLC : number of data bytes, from 0 to 8
    4.    DATA[0~7] : data value (byte)
    5.    Flag : T or R; T denotes an injected message, R denotes a normal message
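A record with these attributes could be parsed roughly as follows. This is a sketch under stated assumptions: it assumes one comma-separated record per line in the attribute order above, hexadecimal data bytes, and only DLC data fields being meaningful. The actual layout of the released txt files may differ, and the `CanRecord` and `parse_record` names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CanRecord:
    timestamp: float   # recorded time in seconds
    can_id: int        # CAN identifier, parsed from HEX
    dlc: int           # number of data bytes, 0 to 8
    data: List[int]    # the first `dlc` data bytes
    injected: bool     # True for flag T, False for flag R


def parse_record(line: str) -> CanRecord:
    """Parse one record, assuming comma-separated fields (an assumption)."""
    fields = [f.strip() for f in line.split(",")]
    timestamp = float(fields[0])
    can_id = int(fields[1], 16)
    dlc = int(fields[2])
    if not 0 <= dlc <= 8:
        raise ValueError(f"DLC out of range: {dlc}")
    # Data bytes are assumed to be written in hexadecimal.
    data = [int(b, 16) for b in fields[3:3 + dlc]]
    flag = fields[-1].upper()
    if flag not in ("T", "R"):
        raise ValueError(f"unexpected flag: {flag!r}")
    return CanRecord(timestamp, can_id, dlc, data, injected=(flag == "T"))


# Made-up sample line for illustration; not taken from the dataset.
record = parse_record("1478198376.389427,043f,8,00,40,60,ff,5a,6e,00,00,R")
```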

1.2 Summary of our dataset

Attack Type                                             # of messages
DoS Attack                                              656,579
Fuzzy Attack                                            591,990
Impersonation Attack                                    995,472
Impersonation Attack2 (Newly generated Sep. 7. 2017)
Attack free state                                       2,369,868

For academic purposes, we are happy to release our datasets.
    1.    DoS Attack [txt]
    2.    Fuzzy Attack [txt]
    3.    Impersonation Attack [txt]
        - Impersonation Attack_2 (Newly generated Sep. 7. 2017) [txt]
        - Impersonation Attack_3 (Newly generated Sep. 7. 2017) [txt]
    4.    Attack Free State [txt]

2. Publication

3. Contact
  • If you have any questions about our study and the dataset, please feel free to contact us for further information.
  • Hyunsung Lee (line at or Huy Kang Kim (cenda at