pyMAISE.datasets.load_anomaly

pyMAISE.datasets.load_anomaly(input_path, output_path, stack_series=False, multiclass=False, propagate_output=False, non_faulty_frac=1.0, timestep_step=1)

Load time series electronic signal data from Mendeley provided by [RPW+23, RPW+22]. This dataset derives from measurements of 14 parameters of the high voltage converter modulators (HVCMs) used at the Spallation Neutron Source facility. Each waveform was classified as “fault” or “run” depending on whether the HVCM failed during operation.

The 14 waveform inputs are:

  • A+IGBT-I: Current passing through the IGBT switch of phase A+ in Qa1 (\(A\))

  • A+*IGBT-I: Current passing through the IGBT switch of phase A+* in Qa3 (\(A\))

  • B+IGBT-I: Current passing through the IGBT switch of phase B+ in Qb1 (\(A\))

  • B+*IGBT-I: Current passing through the IGBT switch of phase B+* in Qb3 (\(A\))

  • C+IGBT-I: Current passing through the IGBT switch of phase C+ in Qc1 (\(A\))

  • C+*IGBT-I: Current passing through the IGBT switch of phase C+* in Qc3 (\(A\))

  • A-Flux: Magnetic flux density for phase A in transformer XA (\(-\))

  • B-Flux: Magnetic flux density for phase B in transformer XB (\(-\))

  • C-Flux: Magnetic flux density for phase C in transformer XC (\(-\))

  • Mod-V: Modulator voltage (\(V\))

  • Mod-I: Modulator current (\(A\))

  • CB-I: Cap bank current (\(-\))

  • CB-V: Cap bank voltage (\(V\))

  • DV/DT: Time derivative of the modulator voltage Mod-V (\(-\))

There is one output for this dataset:

  • Class_Run/Class_Fault: Whether a waveform is a part of a system fault

Note

The outputs returned by this function are not one-hot encoded; each waveform carries a single label, either “Run” or “Fault”.
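To make the distinction in this note concrete, the following sketch contrasts a single-label output with its one-hot encoding. The label values are made up for illustration and do not come from the Mendeley files, and the integer encoding (0 for “Run”, 1 for “Fault”) is an assumption of this example, not necessarily what load_anomaly returns.

```python
import numpy as np

# Hypothetical labels for five waveforms (illustrative values only).
labels = np.array(["Run", "Fault", "Run", "Run", "Fault"])
classes = np.array(["Run", "Fault"])

# Single-label form: one value per waveform.
# Assumed encoding for this sketch: Run -> 0, Fault -> 1.
single_label = (labels == "Fault").astype(int)

# One-hot form: one column per class (Class_Run, Class_Fault).
one_hot = (labels[:, None] == classes[None, :]).astype(int)

print(single_label.shape)  # (5,)
print(one_hot.shape)       # (5, 2)
```

If a model needs one-hot targets, the encoding has to be applied after loading.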

Parameters:
  • input_path (str) –

    Path to input file. Raw data can be found at Mendeley

  • output_path (str) –

    Path to output file. Raw data can be found at Mendeley

  • stack_series (bool, default=False) – If true, the samples and time steps dimensions are combined into a single dimension. Setting stack_series to true requires propagate_output to be true as well.

  • multiclass (bool, default=False) – If true, then the multiclass case is returned with 8 possible classifications: Normal, IGBT Fault, CB or TPS Fault, Driver Fault, Flux Fault, DV/DT Fault, SCR AC Input Fault, or Misc/Unknown. If this is false then the binary class is returned (Run or Fault).

  • non_faulty_frac (float, default=1.0) – The fraction of non-faulty data to include.

  • timestep_step (int, default=1) – Stride between retained time steps: every timestep_step-th time step is kept. When timestep_step == 1, all time steps are returned.
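The effect of timestep_step and stack_series on array shapes can be sketched with NumPy on synthetic data. This is an illustration of the behavior described above, not pyMAISE's implementation. Two things are assumptions here: that subsampling starts at the first time step, and that propagate_output repeats each waveform's label at every time step (this page does not describe that parameter).

```python
import numpy as np

# Synthetic stand-in for loaded data: 3 waveforms, 6 time steps, 14 inputs.
samples, timesteps, features = 3, 6, 14
inputs = np.arange(samples * timesteps * features, dtype=float).reshape(
    samples, timesteps, features
)
outputs = np.array([0, 1, 0])  # one label per waveform (assumed encoding)

# timestep_step: keep every timestep_step-th time step (assumed to start at 0).
timestep_step = 2
inputs_sub = inputs[:, ::timestep_step, :]

# Assumed meaning of propagate_output: repeat each waveform's label at
# every retained time step, so outputs gain a time steps dimension.
outputs_prop = np.repeat(outputs[:, None], inputs_sub.shape[1], axis=1)

# stack_series: merge the samples and time steps dimensions into one.
inputs_stacked = inputs_sub.reshape(-1, features)
outputs_stacked = outputs_prop.reshape(-1)

print(inputs_sub.shape)      # (3, 3, 14)
print(inputs_stacked.shape)  # (9, 14)
print(outputs_stacked)       # [0 0 0 1 1 1 0 0 0]
```

Under this reading, the requirement that propagate_output be true follows naturally: once samples and time steps are merged, every stacked row needs its own label.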

Returns:

  • inputs (xarray.DataArray) – 14 inputs.

  • outputs (xarray.DataArray) – 1 output.
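A minimal usage sketch follows. The wrapper name load_binary_hvcm and the file paths in the commented call are hypothetical; the keyword arguments are taken from the signature above, and unpacking the result as an (inputs, outputs) pair is assumed from the Returns listing.

```python
def load_binary_hvcm(input_path, output_path):
    """Load the binary (Run/Fault) case, keeping every second time step."""
    # Imported lazily so this sketch can be defined without pyMAISE installed.
    from pyMAISE.datasets import load_anomaly

    # Assumption: the two documented return values unpack as a pair.
    inputs, outputs = load_anomaly(
        input_path,
        output_path,
        stack_series=False,
        multiclass=False,
        propagate_output=False,
        non_faulty_frac=1.0,
        timestep_step=2,
    )
    return inputs, outputs


# Example call with placeholder paths to the files downloaded from Mendeley:
# inputs, outputs = load_binary_hvcm("path/to/inputs", "path/to/outputs")
# print(inputs.dims, outputs.dims)
```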