An overview
Classification/data extraction
Classification is based exclusively on the document's content; all required data is extracted from the documents themselves, not from the original document name. Only the data required by the naming convention is extracted (excluding suffixes such as "no." if a number is required).
If multiple content/data fields are to be captured, all extracted data is generally shown in the new document name automatically, with the fields separated by underscores. No underscores are used within a data field.
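The underscore rule can be illustrated with a minimal sketch. The function name `build_document_name` is hypothetical, and replacing a stray underscore inside a field with a space is an assumption made for illustration; the text above only states that no underscores are used within a data field.

```python
def build_document_name(fields, extension="pdf"):
    """Join the extracted data fields with underscores.

    Assumption: an underscore found inside a field is replaced by a space,
    so that underscores only ever act as field separators.
    """
    cleaned = [str(field).replace("_", " ").strip() for field in fields]
    # Skip empty fields so that no double underscores appear in the name.
    cleaned = [field for field in cleaned if field]
    return "_".join(cleaned) + "." + extension


print(build_document_name(["Land register excerpt", "Teltow", "234", "2020-01-01"]))
# prints: Land register excerpt_Teltow_234_2020-01-01.pdf
```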
Misspellings are not carried over; they are corrected in line with the new German spelling rules.
Document classes can also be chosen by differentiation from other classes. Example: In Architrave Standard, maintenance reports are divided into different trades; if a maintenance report cannot be clearly assigned to one trade, it is classified as "Maintenance report, other".
The main document is always identified; any scanned cover sheets or letters accompanying it are ignored.
Documents that cannot be clearly assigned to a single document class are stored in the folder "Unclassifiable" under the name "Unknown_Initial document name (upload name)", allowing them to be processed further, e.g. moved or renamed, by users at any time.
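The fallback for unclassifiable documents can be sketched as follows. The function `target_location` and the idea of using the document class as the folder name for classified documents are assumptions for illustration only; the source describes only the "Unclassifiable" folder and the "Unknown_" prefix.

```python
def target_location(document_class, new_name, upload_name):
    """Return (folder, file name) for a processed document.

    `document_class` is None when no single class could be assigned.
    """
    if document_class is None:
        # Fallback described above: keep the upload name, prefixed with "Unknown_".
        return "Unclassifiable", f"Unknown_{upload_name}"
    # Assumption: classified documents are filed under their class name.
    return document_class, new_name


print(target_location(None, None, "scan_0815.pdf"))
# prints: ('Unclassifiable', 'Unknown_scan_0815.pdf')
```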
Document appendices
During the upload process, the uploader can add a comment to define which documents should be considered appendices to which main documents, for example when appendices to documents already in the data space are uploaded. These appendices can be classified to match the main document and are given the suffix "Appendix" or a suffix chosen by the uploader.
If a main document and multiple appendices are uploaded at the same time, the uploader can link the appendices to the main document during the upload process. Any number of documents can be linked to a main document. This does not impair assignment or classification.
Multidocs
If a scanned document consists of multiple documents from different document classes (= Multidoc), the first page(s) of the document are usually taken as the basis for classification, as it can often be assumed that this is a main document including appendices. The document is classified according to the main document identified: for example, a residential tenancy agreement including a floor plan is always classified as a residential tenancy agreement. The new document name does not make reference to any appendices.
Depending on the structure of the document, the relevant part may be located further inside the document and is then classified in order of priority.
Example 1: If a document contains an invoice for a maintenance service including the maintenance report, it is classified as a maintenance report and the suffix "incl. invoice" is noted in the supplementary information.
Example 2: If a document contains a rent adjustment letter including a new permanent rent invoice, it is classified as a rent invoice.
Example 3: If a building application is present with the associated building permit (usually as a complete construction file), the document is classified as a building permit.
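The Multidoc rule and the three examples can be sketched as a default plus a small set of priority pairs. This is only an illustration under stated assumptions: the names `OUTRANKS` and `classify_multidoc` are hypothetical, and the priority pairs are taken solely from the examples above rather than from a complete rule set.

```python
# Hypothetical priority pairs (winner, loser), derived only from Examples 1 to 3.
OUTRANKS = {
    ("Maintenance report", "Invoice"),
    ("Rent invoice", "Rent adjustment letter"),
    ("Building permit", "Building application"),
}


def classify_multidoc(part_classes):
    """Pick one document class for a scan made up of several parts.

    `part_classes` lists the classes detected in the scan, in page order.
    """
    if not part_classes:
        raise ValueError("no parts detected")
    chosen = part_classes[0]  # default: the first page(s) decide
    for candidate in part_classes[1:]:
        # A later part wins only if a priority pair says so.
        if (candidate, chosen) in OUTRANKS:
            chosen = candidate
    return chosen


print(classify_multidoc(["Residential tenancy agreement", "Floor plan"]))
print(classify_multidoc(["Invoice", "Maintenance report"]))
```

The first call keeps the class of the first part, while the second switches to the maintenance report because of the priority pair from Example 1.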
Duplicates
If the same document is uploaded multiple times, the copies are currently not recognised as duplicates. Instead, they are shown in the data space with identical classification, names and numbering.
Example:
Land register excerpt_Teltow_234_2020-01-01.pdf
Land register excerpt_Teltow_234_2020-01-01 (2).pdf
Land register excerpt_Teltow_234_2020-01-01 (3).pdf
etc.
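The naming pattern in this example can be reproduced with a short sketch. The function name `duplicate_names` is hypothetical, and the counter format " (n)" before the file extension is inferred from the example above only.

```python
def duplicate_names(base_name, copies):
    """Return the names shown when the same file is uploaded `copies` times."""
    stem, dot, ext = base_name.rpartition(".")
    if not dot:  # no extension present
        stem, ext = base_name, ""
    names = [base_name]
    for n in range(2, copies + 1):
        suffix = f" ({n})"
        names.append(f"{stem}{suffix}.{ext}" if ext else f"{stem}{suffix}")
    return names


for name in duplicate_names("Land register excerpt_Teltow_234_2020-01-01.pdf", 3):
    print(name)
```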
Formats
The following file types are currently processed by DELPHI/quality management:
bmp, ctb (AutoCAD plot file), dng, doc(x), dwg, dxf, gif, htm(l) (websites), jp(e)g, ldb (Microsoft Access), msg, pdf, plt, png, ppt(x), rar and zip (are unzipped), rtf, thmx, tif, txt, vwx, vxm (plan files such as dwg), xls(x), xlsm, and graphic files such as max, vrmesh, tga, psd, eps and mov.
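A simple pre-upload check against this list might look like the sketch below. The names `SUPPORTED_EXTENSIONS` and `is_supported` are hypothetical, the case-insensitive comparison is an assumption, and because the graphic formats are introduced with "such as", the set is not necessarily complete.

```python
from pathlib import PurePath

# Extensions taken from the list above; "doc(x)" is expanded to "doc" and "docx", etc.
SUPPORTED_EXTENSIONS = {
    "bmp", "ctb", "dng", "doc", "docx", "dwg", "dxf", "gif",
    "max", "vrmesh", "tga", "psd", "eps", "mov",
    "htm", "html", "jpg", "jpeg", "ldb", "msg", "pdf", "plt", "png",
    "ppt", "pptx", "rar", "zip", "rtf", "thmx", "tif", "txt",
    "vwx", "vxm", "xls", "xlsx", "xlsm",
}


def is_supported(file_name):
    """Check the file extension against the list (assumed case-insensitive)."""
    extension = PurePath(file_name).suffix.lstrip(".").lower()
    return extension in SUPPORTED_EXTENSIONS


print(is_supported("site_plan.DWG"))  # prints: True
print(is_supported("notes.md"))       # prints: False
```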
Naming convention/styles of writing
To ensure that recurring terms are used in a standardised manner, the following naming rules, among other things, have been defined:
Style of writing for floor levels:
UPG, BF, LF, GF, UF1, UF2, TF
Any irregularities are written out (e.g. gallery level, intermediate floor, technical level, etc.)
Style of writing for contractual partners/company names:
Company names are written exactly as they appear in the document; capitalisation is always maintained and the legal form of the company is always recorded.
Style of writing for house numbers:
No leading zeros are written before single-digit house numbers. Letters are written in lowercase directly after the number (example: 1a).
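A minimal sketch of this rule is given here. The function name `normalize_house_number` is hypothetical; stripping leading zeros from all numbers (not only single-digit ones) and leaving unexpected input untouched are assumptions.

```python
import re


def normalize_house_number(raw):
    """Drop leading zeros and attach a lowercase letter directly to the number."""
    match = re.fullmatch(r"\s*0*(\d+)\s*([A-Za-z]?)\s*", raw)
    if match is None:
        return raw.strip()  # leave anything unexpected untouched
    number, letter = match.groups()
    return number + letter.lower()


print(normalize_house_number("01 A"))  # prints: 1a
print(normalize_house_number("7"))     # prints: 7
```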
Style of writing for construction projects:
As building permit titles can be very long, they are shortened, e.g. by leaving out articles.
Example: Conversion of a warehouse into an office building 🡪 Conversion warehouse into office building
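The shortening in this example can be approximated by dropping a small set of filler words. This is a sketch only: the name `shorten_project_title` and the word list are assumptions, and dropping "of" in addition to the articles is inferred from the example rather than stated as a rule.

```python
# Assumed filler words: English articles plus "of", as seen in the example.
DROPPED_WORDS = {"a", "an", "the", "of"}


def shorten_project_title(title):
    """Remove filler words while keeping the remaining word order."""
    kept = [word for word in title.split() if word.lower() not in DROPPED_WORDS]
    return " ".join(kept)


print(shorten_project_title("Conversion of a warehouse into an office building"))
# prints: Conversion warehouse into office building
```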
Style of writing for plan content:
The plan content is taken from the header. Directions are recorded as follows: North, East, South, West and NE, NW, SW, etc.
Style of writing for lists:
If, e.g. multiple technical systems are listed in the document and the naming convention requires a system number as a data field, only the first system number and "i.a." appear in the new document name.
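As an illustration of the list rule, the sketch below builds the data field from a list of system numbers. The function name `system_number_field` is hypothetical, and the single space between the number and "i.a." is an assumption.

```python
def system_number_field(system_numbers):
    """Return the first system number, followed by "i.a." if there are more."""
    if not system_numbers:
        return ""
    first = str(system_numbers[0])
    return first if len(system_numbers) == 1 else f"{first} i.a."


print(system_number_field(["4711", "4712", "4713"]))  # prints: 4711 i.a.
```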
Style of writing for dates:
If the date given in the document consists only of the month and year, or only of the year, the first day of that month or year is used.
Example: (January) 2020 🡪 01.01.2020 or 2020-01-01
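The completion of partial dates can be sketched as follows. The function name `normalize_partial_date` and the accepted input formats ("YYYY", "MM.YYYY", "DD.MM.YYYY") are assumptions for illustration.

```python
from datetime import date


def normalize_partial_date(text):
    """Parse 'YYYY', 'MM.YYYY' or 'DD.MM.YYYY'; missing parts default to 1."""
    parts = [int(part) for part in text.strip().split(".")]
    if len(parts) == 1:
        day, month, year = 1, 1, parts[0]
    elif len(parts) == 2:
        month, year = parts
        day = 1
    elif len(parts) == 3:
        day, month, year = parts
    else:
        raise ValueError(f"unsupported date: {text!r}")
    return date(year, month, day)


completed = normalize_partial_date("2020")
print(completed.strftime("%d.%m.%Y"))  # prints: 01.01.2020
print(completed.isoformat())           # prints: 2020-01-01
```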
Style of writing for data field "Period represented":
Example for year: 01.01.2018–31.12.2018 🡪 2018
Example for quarter: 01.01.2018–31.03.2018 🡪 2018 Q1
Example for multiple months: 06.2016–10.2017 🡪 2017-10 - 2016-06 (with more recent date first)
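The three examples can be reproduced with a small sketch. The function name `period_label` is hypothetical; it assumes the period is given as a start and end date, and it treats every period that is neither a full calendar year nor a full quarter as a month range with the more recent date first.

```python
import calendar
from datetime import date


def period_label(start, end):
    """Format the data field "Period represented" from a start and end date."""
    same_year = start.year == end.year
    # Full calendar year: 1 January to 31 December.
    if same_year and (start.month, start.day) == (1, 1) and (end.month, end.day) == (12, 31):
        return str(start.year)
    # Full quarter: first day of the quarter to last day of its third month.
    last_day = calendar.monthrange(end.year, end.month)[1]
    if (same_year and start.day == 1 and start.month in (1, 4, 7, 10)
            and end.month == start.month + 2 and end.day == last_day):
        quarter = (start.month - 1) // 3 + 1
        return f"{start.year} Q{quarter}"
    # Anything else: month range, more recent date first.
    return f"{end:%Y-%m} - {start:%Y-%m}"


print(period_label(date(2018, 1, 1), date(2018, 12, 31)))  # prints: 2018
print(period_label(date(2018, 1, 1), date(2018, 3, 31)))   # prints: 2018 Q1
print(period_label(date(2016, 6, 1), date(2017, 10, 31)))  # prints: 2017-10 - 2016-06
```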
Style of writing for data field "Supplementary information for designation":
If documents clearly refer to e.g. specific building sections, this can be noted here.