In-Vehicle Network Intrusion Detection Challenge
This dataset was used in the “In-vehicle Network Intrusion Detection” track of the “Information Security R&D Data Challenge 2019.”
It consists of in-vehicle network traffic data from a HYUNDAI Sonata, a KIA Soul, and a CHEVROLET Spark.
The dataset includes Normal data and Attack data (i.e., Flooding, Fuzzy, Malfunction, and Replay attacks).
* Class is ‘R’ or ‘T’ – ‘R’ is a normal message, and ‘T’ is an attack message.
The preliminary dataset covers three vehicle types (CHEVROLET Spark, HYUNDAI Sonata, KIA Soul) and three attack types (Flooding, Fuzzy, Malfunction).
The final dataset is divided into two sessions, each with different attack types: the first session includes the Fuzzy and Malfunction attacks, and the second session includes the Malfunction and Replay attacks.
Each dataset has the fields Timestamp (logging time), CAN ID (CAN identifier), DLC (data length code), and Payload (CAN data field). Each training dataset also has a Class field.
All data were collected while each vehicle was stationary.
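The fields listed above map naturally onto a small record type. The following is a minimal sketch only: the source does not describe the on-disk file layout, so the whitespace-separated line format, the `CanRecord` type, and the `parse_record` helper are all assumptions made for illustration and may need adjusting to the actual files.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CanRecord:
    timestamp: float        # Timestamp: logging time
    can_id: int             # CAN ID: CAN identifier
    dlc: int                # DLC: data length code (number of payload bytes)
    payload: bytes          # Payload: CAN data field
    label: Optional[str]    # Class: 'R' (normal) or 'T' (attack); None if absent


def parse_record(line: str) -> CanRecord:
    """Parse one line in an ASSUMED layout (not confirmed by the source):

        <timestamp> <hex CAN ID> <DLC> <hex byte> ... [R|T]
    """
    tokens = line.split()
    timestamp = float(tokens[0])
    can_id = int(tokens[1], 16)
    dlc = int(tokens[2])
    rest = tokens[3:]

    # The Class field is only present in training data, so it is optional.
    label = None
    if rest and rest[-1] in ("R", "T"):
        label = rest.pop()

    payload = bytes(int(tok, 16) for tok in rest)
    if len(payload) != dlc:
        raise ValueError(f"DLC is {dlc} but {len(payload)} payload bytes found")
    return CanRecord(timestamp, can_id, dlc, payload, label)
```

With this sketch, a labeled line would yield a record whose `label` is `'T'` for attack messages, and an unlabeled line would yield `label=None`.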
3. Dataset types
- Normal traffic data on the CAN bus.
- Messages with each CAN ID are sent on the CAN bus to report the current status of the embedded sensors and devices that assist in operating the vehicle.
- When CAN messages from several different sender ECU nodes are transmitted simultaneously to a receiver ECU node, their CAN ID values are compared to determine which message is accepted first.
- The lower the value of a CAN ID, the higher its priority.
- The flooding attack can limit communication among ECU nodes and disrupt normal driving.
- For the fuzzy attack, we randomly generated CAN messages.
- This process was conducted for both the ID field and the Data field.
- The randomly generated CAN IDs ranged from 0x000 to 0x7FF and included both CAN IDs originally extracted from the vehicle and CAN IDs that were not.
- The malfunction attack targets a selected CAN ID from among the extractable CAN IDs of a certain vehicle.
- For a malfunction attack, manipulation of the data field has to be accompanied by the simultaneous injection of randomly selected CAN IDs.
- When the values in the 8-byte data field were manipulated using 00 or a random value, the vehicles reacted abnormally.
- Normal data from the attack-free dataset is injected into the CAN bus as a normal dataset.
- The replay attack causes a problem by injecting a set of CAN messages extracted and logged in a certain order into the vehicle network.
- The advantage here is that it does not require the reverse-engineering process for interpreting the meaning and function of the IDs and Data fields.
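The priority rule and the fuzzy and malfunction descriptions above can be illustrated with a short sketch. This is not the authors' attack tooling; the function names are hypothetical, and using 8 data bytes for fuzzy messages is an assumption (the text states 8 bytes only for the malfunction case). The code builds message tuples in memory and does not touch a real CAN bus.

```python
import random

MAX_CAN_ID = 0x7FF  # upper end of the ID range quoted in the text
DATA_LEN = 8        # 8-byte data field (assumed for fuzzy messages as well)


def arbitration_winner(can_ids):
    """Return the ID that is accepted first: the lowest value has priority."""
    return min(can_ids)


def random_data(rng):
    """Return DATA_LEN random bytes for the data field."""
    return bytes(rng.randrange(256) for _ in range(DATA_LEN))


def fuzzy_message(rng):
    """Randomize both fields: an ID in 0x000..0x7FF and a random data field."""
    can_id = rng.randint(0x000, MAX_CAN_ID)  # randint is inclusive at both ends
    return can_id, random_data(rng)


def malfunction_message(target_id, rng, use_zero=True):
    """Keep the selected target ID; set the data field to 00s or random bytes."""
    data = bytes(DATA_LEN) if use_zero else random_data(rng)
    return target_id, data


rng = random.Random(0)  # seeded only so that runs are repeatable
fuzzy_id, fuzzy_data = fuzzy_message(rng)
mal_id, mal_data = malfunction_message(0x153, rng)  # 0x153 is an arbitrary example ID
```

Because the lowest ID always wins in `arbitration_winner`, a sender that repeatedly transmits a very low ID would be served ahead of the other nodes, which is consistent with the disruption described for the flooding attack.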
4. Dataset Download
Dataset Link: https://forms.gle/EtnozTgreCWcAF2F6
5. Related Publication
Mee Lan Han, Byung Il Kwak, and Huy Kang Kim. “Anomaly intrusion detection method for vehicular networks based on survival analysis.” Vehicular Communications 14 (2018): 52-63.