:py:mod:`flow.record` ===================== .. py:module:: flow.record Subpackages ----------- .. toctree:: :titlesonly: :maxdepth: 3 adapter/index.rst fieldtypes/index.rst tools/index.rst Submodules ---------- .. toctree:: :titlesonly: :maxdepth: 1 base/index.rst exceptions/index.rst jsonpacker/index.rst packer/index.rst selector/index.rst stream/index.rst utils/index.rst whitelist/index.rst Package Contents ---------------- Classes ~~~~~~~ .. autoapisummary:: flow.record.FieldType flow.record.GroupedRecord flow.record.Record flow.record.RecordDescriptor flow.record.RecordField flow.record.JsonRecordPacker Functions ~~~~~~~~~ .. autoapisummary:: :nosignatures: flow.record.DynamicDescriptor flow.record.RecordAdapter flow.record.RecordReader flow.record.RecordWriter flow.record.extend_record flow.record.iter_timestamped_records flow.record.open_path flow.record.open_path_or_stream flow.record.open_stream flow.record.set_ignored_fields_for_comparison flow.record.stream Attributes ~~~~~~~~~~ .. autoapisummary:: flow.record.IGNORE_FIELDS_FOR_COMPARISON flow.record.RECORD_VERSION flow.record.RECORDSTREAM_MAGIC flow.record.dynamic_fieldtype .. py:data:: IGNORE_FIELDS_FOR_COMPARISON .. py:data:: RECORD_VERSION :value: 1 .. py:data:: RECORDSTREAM_MAGIC :value: b'RECORDSTREAM\n' .. py:function:: DynamicDescriptor(name, fields) .. py:class:: FieldType .. py:method:: default() :classmethod: Return the default value for the field in the Record template. .. py:class:: GroupedRecord(name, records) Bases: :py:obj:`Record` GroupedRecord acts like a normal Record, but can contain multiple records. See it as a flat Record view on top of multiple Records. If two Records have the same fieldname, the first one will prevail. .. py:method:: get_record_by_type(type_name) Get record in a GroupedRecord by type_name. :param type_name: The record type name (for example wq/meta). :type type_name: str :returns: None or the record .. py:method:: __repr__() Return repr(self). .. py:method:: __setattr__(attr, val) Enforce setting the fields to their respective types. .. py:method:: __getattr__(attr) .. py:class:: Record .. py:attribute:: __slots__ :value: () .. py:method:: __eq__(other) Return self==value. .. py:method:: __setattr__(k, v) Enforce setting the fields to their respective types. .. py:method:: __hash__() -> int Return hash(self). .. py:method:: __repr__() Return repr(self). .. py:function:: RecordAdapter(url: Optional[str] = None, out: bool = False, selector: Optional[str] = None, clobber: bool = True, fileobj: Optional[BinaryIO] = None, **kwargs) -> Union[flow.record.adapter.AbstractWriter, flow.record.adapter.AbstractReader] .. py:class:: RecordDescriptor(name: str, fields: Optional[Sequence[tuple[str, str]]] = None) Record Descriptor class for defining a Record type and its fields. .. py:property:: fields :type: Mapping[str, RecordField] Get fields mapping (without required fields). eg: { "foo": RecordField("foo", "string"), "bar": RecordField("bar", "varint"), } :returns: Mapping of Record fields .. py:property:: descriptor_hash :type: int Returns the (cached) descriptor hash .. py:property:: identifier :type: tuple[str, int] Returns a tuple containing the descriptor name and hash .. py:attribute:: name :type: str .. py:attribute:: recordType :type: type .. py:method:: get_required_fields() -> Mapping[str, RecordField] :staticmethod: Get required fields mapping. eg: { "_source": RecordField("_source", "string"), "_classification": RecordField("_classification", "datetime"), "_generated": RecordField("_generated", "datetime"), "_version": RecordField("_version", "vaeint"), } :returns: Mapping of required fields .. py:method:: get_all_fields() -> Mapping[str, RecordField] Get all fields including required meta fields. eg: { "ts": RecordField("ts", "datetime"), "foo": RecordField("foo", "string"), "bar": RecordField("bar", "varint"), "_source": RecordField("_source", "string"), "_classification": RecordField("_classification", "datetime"), "_generated": RecordField("_generated", "datetime"), "_version": RecordField("_version", "varint"), } :returns: Mapping of all Record fields .. py:method:: getfields(typename: str) -> RecordFieldSet Get fields of a given type. :param typename: The typename of the fields to return. eg: "string" or "datetime" :returns: RecordFieldSet of fields with the given typename .. py:method:: __call__(*args, **kwargs) -> Record Create a new Record initialized with `args` and `kwargs`. .. py:method:: init_from_dict(rdict: dict[str, Any], raise_unknown=False) -> Record Create a new Record initialized with key, value pairs from `rdict`. If `raise_unknown=True` then fields on `rdict` that are unknown to this RecordDescriptor will raise a TypeError exception due to initializing with unknown keyword arguments. (default: False) :returns: Record with data from `rdict` .. py:method:: init_from_record(record: Record, raise_unknown=False) -> Record Create a new Record initialized with data from another `record`. If `raise_unknown=True` then fields on `record` that are unknown to this RecordDescriptor will raise a TypeError exception due to initializing with unknown keyword arguments. (default: False) :returns: Record with data from `record` .. py:method:: extend(fields: Sequence[tuple[str, str]]) -> RecordDescriptor Returns a new RecordDescriptor with the extended fields :returns: RecordDescriptor with extended fields .. py:method:: get_field_tuples() -> tuple[tuple[str, str]] Returns a tuple containing the (typename, name) tuples, eg: (('boolean', 'foo'), ('string', 'bar')) :returns: Tuple of (typename, name) tuples .. py:method:: calc_descriptor_hash(name, fields: Sequence[tuple[str, str]]) -> int :staticmethod: Calculate and return the (cached) descriptor hash as a 32 bit integer. The descriptor hash is the first 4 bytes of the sha256sum of the descriptor name and field names and types. .. py:method:: __hash__() -> int Return hash(self). .. py:method:: __eq__(other: RecordDescriptor) -> bool Return self==value. .. py:method:: __repr__() -> str Return repr(self). .. py:method:: definition(reserved: bool = True) -> str Return the RecordDescriptor as Python definition string. If `reserved` is True it will also return the reserved fields. :returns: Descriptor definition string .. py:method:: base(**kwargs_sink) .. py:exception:: RecordDescriptorError Bases: :py:obj:`Exception` Raised when there is an error constructing a record descriptor .. py:class:: RecordField(name: str, typename: str) .. py:attribute:: name .. py:attribute:: typename .. py:attribute:: type .. py:method:: __repr__() Return repr(self). .. py:function:: RecordReader(url: Optional[str] = None, selector: Optional[str] = None, fileobj: Optional[BinaryIO] = None, **kwargs) -> flow.record.adapter.AbstractReader .. py:function:: RecordWriter(url: Optional[str] = None, clobber: bool = True, **kwargs) -> flow.record.adapter.AbstractWriter .. py:data:: dynamic_fieldtype .. py:function:: extend_record(record: Record, other_records: list[Record], replace: bool = False, name: Optional[str] = None) -> Record Extend ``record`` with fields and values from ``other_records``. Duplicate fields are ignored in ``other_records`` unless ``replace=True``. :param record: Initial Record to extend. :param other_records: List of Records to use for extending/replacing. :param replace: if ``True``, it will replace existing fields and values in ``record`` from fields and values from ``other_records``. Last record always wins. :param name: rename the RecordDescriptor name to ``name``. Otherwise, use name from initial ``record``. :returns: Extended Record .. py:function:: iter_timestamped_records(record: Record) -> Iterator[Record] Yields timestamped annotated records for each `datetime` fieldtype in `record`. If `record` does not have any `datetime` fields the original record is returned. :param record: Record to add timestamp fields for. :Yields: Record annotated with `ts` and `ts_description` fields for each `datetime` fieldtype. .. py:function:: open_path(path: str, mode: str, clobber: bool = True) -> IO Open ``path`` using ``mode`` and returns a file object. It handles special cases if path is meant to be stdin or stdout. And also supports compression based on extension or file header of stream. :param path: Filename or path to filename to open :param mode: Could be "r", "rb" to open file for reading, "w", "wb" for writing :param clobber: Overwrite file if it already exists if `clobber=True`, else raises IOError. .. py:function:: open_path_or_stream(path: Union[str, pathlib.Path, BinaryIO], mode: str, clobber: bool = True) -> IO .. py:function:: open_stream(fp: BinaryIO, mode: str) -> BinaryIO .. py:function:: set_ignored_fields_for_comparison(ignored_fields: Iterable[str]) -> None Can be used to update the IGNORE_FIELDS_FOR_COMPARISON from outside the flow.record package scope .. py:function:: stream(src, dst) .. py:class:: JsonRecordPacker(indent=None, pack_descriptors=True) .. py:method:: register(desc, notify=False) .. py:method:: pack_obj(obj) .. py:method:: unpack_obj(obj) .. py:method:: pack(obj) .. py:method:: unpack(d)