Thursday, 26 October 2023

Poor Microsoft Outlook LDAP address book performance with 389 Directory Server on SUSE SLES 15

After migrating from OpenLDAP to 389 Directory Server, a customer experienced poor performance in the address book connection from Microsoft Outlook. Queries took about 5 to 10 seconds on an LDAP tree containing approx. 5,000 records.

The reason was a missing SUB index on the field displayName.

An investigation of the LDAP log file /var/log/dirsrv/SERVERNAME/access shows the search query which is sent by Outlook:

(&(mail=*)(|(mail=Pinnau*)(cn=Pinnau*)(sn=Pinnau*)(givenName=Pinnau*)(displayName=Pinnau*)))

Outlook searches for entries that MUST have an email address AND in which at least one of the fields mail, cn, sn, givenName or displayName matches a "begins with" condition.
This query is fixed and cannot be changed in the Outlook settings.
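
To make the structure of that filter easier to see, here is a minimal Java sketch that rebuilds it for an arbitrary search term. The class and method names are hypothetical, the attribute list is copied from the logged query, and the term is not escaped, so this is only meant for reproducing the query by hand (for example with ldapsearch), not for production use.

```java
final class OutlookFilterSketch {

    // Attributes copied from the query seen in the access log.
    private static final String[] ATTRIBUTES =
            {"mail", "cn", "sn", "givenName", "displayName"};

    // Builds (&(mail=*)(|(attr=term*)...)): a presence check on mail
    // combined with one "begins with" substring match per attribute.
    static String buildFilter(String term) {
        StringBuilder filter = new StringBuilder("(&(mail=*)(|");
        for (String attribute : ATTRIBUTES) {
            filter.append('(').append(attribute).append('=')
                  .append(term).append("*)");
        }
        return filter.append("))").toString();
    }

    public static void main(String[] args) {
        System.out.println(buildFilter("Pinnau"));
    }
}
```

Every branch of the OR part is a substring match, which is why each of the five attributes needs its own substring index.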

To answer this query efficiently on the server side, a SUB (substring) index is required for each of these fields.

It turned out that the index for displayName is missing in the default setup of 389 Directory Server.

To get an overview of the indexes you can run the db2index command:

sles15:~ # dsctl INSTANCE_NAME stop
sles15:~ # dsctl INSTANCE_NAME db2index userRoot
...
sles15:~ # - INFO - bdb_db2index - userroot: Indexing attribute: aci
sles15:~ # - INFO - bdb_db2index - userroot: Indexing attribute: cn
sles15:~ # - INFO - bdb_db2index - userroot: Indexing entryrdn
sles15:~ # - INFO - bdb_db2index - userroot: Indexing attribute: entryusn
sles15:~ # - INFO - bdb_db2index - userroot: Indexing attribute: gidnumber
sles15:~ # - INFO - bdb_db2index - userroot: Indexing attribute: givenname
...
sles15:~ # dsctl INSTANCE_NAME start

Create an LDIF file to add the missing index:

dn: cn=displayName,cn=index,cn=userroot,cn=ldbm database,cn=plugins,cn=config
objectClass: top
objectClass: nsIndex
cn: displayName
nsSystemIndex: false
nsIndexType: pres
nsIndexType: eq
nsIndexType: sub
nsIndexType: approx
nsMatchingRule: 1.3.6.1.4.1.42.2.27.9.4.76.1

Use ldapmodify to create the index:

sles15:~ # ldapmodify -a -D "cn=Directory Manager" -W -h localhost -x < index_displayname.ldif

And rebuild indexes:

sles15:~ # dsctl INSTANCE_NAME stop
sles15:~ # dsctl INSTANCE_NAME db2index userRoot
...
sles15:~ # - INFO - bdb_db2index - userroot: Indexing attribute: aci
sles15:~ # - INFO - bdb_db2index - userroot: Indexing attribute: cn
sles15:~ # - INFO - bdb_db2index - userroot: Indexing attribute: displayname
sles15:~ # - INFO - bdb_db2index - userroot: Indexing entryrdn
sles15:~ # - INFO - bdb_db2index - userroot: Indexing attribute: entryusn
sles15:~ # - INFO - bdb_db2index - userroot: Indexing attribute: gidnumber
sles15:~ # - INFO - bdb_db2index - userroot: Indexing attribute: givenname
...
sles15:~ # dsctl INSTANCE_NAME start 

The newly created index for displayName must appear in the output of db2index. The Outlook address book should now be noticeably faster and return results in less than 1 second.


Tuesday, 7 January 2020

Enable RDP and Samba / Shares in Windows 10 Firewall

If RDP or Samba share connections to a Windows 10 machine fail, check the firewall configuration on the Windows 10 client.

Click on "Start" and type "firewall". Open "Windows Defender Firewall mit erweiterter Sicherheit" (Windows Defender Firewall with Advanced Security).

A window opens. Navigate to "Eingehende Regeln" (inbound rules) in the tree on the left side.

Enable the following rules for the "Domäne" (domain) profile (3rd column):

  • Datei- und Druckerfreigabe (NB-Sitzung eingehend), Protokoll TCP, Port 139
  • Datei- und Druckerfreigabe (SMB eingehend), Protokoll TCP, Port 445 
  • Remotedesktop - Benutzermodus (TCP eingehend)

Of course the corresponding functions (file sharing, Remote Desktop) also have to be enabled and configured properly on the machine.

Thursday, 17 January 2019

JavaMail API 1.6.2 and UTF-8 headers: com.sun.mail javax.mail.jar with OpenJDK 11

I ran into some trouble when upgrading the System Concept DMS project to run with OpenJDK 11. Since Java 11 does not include an implementation of javax.mail anymore, I downloaded the current implementation from the Maven repository:

http://central.maven.org/maven2/com/sun/mail/javax.mail

At that time the most recent version was 1.6.2. Some mail-related unit tests failed. The reason was that headers added via MimeMessage.addHeader(String name, String data) are now written as UTF-8 encoded strings into the mail source. I use MimeMessage.writeTo(OutputStream stream) to persist the message to an EML file. This file differed from the expected content (which was not UTF-8 encoded), so the tests failed.
So far so good. After fixing the tests by simply replacing the expected results with UTF-8 encoded data, I ran into the next, and real, problem: the headers are written UTF-8 encoded, but they are not read properly when opening an EML file via new MimeMessage(Session session, InputStream in). I wasted some hours trying to find an outdated library somewhere in the sub-projects, and I was not able to find any hints on the web.
Finally I found the solution in the com.sun.mail sources: you have to add the property "mail.mime.allowutf8" = "true" to the JavaMail Session:

Properties props = System.getProperties();
props.put("mail.host", "smtp.dummydomain.com");
props.put("mail.transport.protocol", "smtp");
props.put("mail.mime.allowutf8", "true");
session = Session.getDefaultInstance(props, null);

When this property is set, the com.sun.mail implementation reads UTF-8 encoded headers. I do not know why this property is evaluated when reading but not when writing the headers. From my point of view this is inconsistent.

Friday, 7 September 2018

macOS deployment: shipping a Java runtime (JRE) with an Eclipse RCP product

This is not entirely straightforward. Follow these instructions to ship your RCP product with a JRE on macOS.

Prerequisites


You need a working RCP product export with Maven Tycho to build a MacOS product.

1. Install Java Development Kit (JDK) on a MacOS system


Download the installation package from Oracle and install it on a macOS target system. You need a JDK; a JRE is not sufficient. Don't worry about the size, since we will delete everything but the JRE later on.

2. Copy the JDK to your product


The (for now) current JDK 1.8.0_181 is installed into this location:

/Library/Java/JavaVirtualMachines/jdk1.8.0_181.jdk

Open a shell and copy the JDK to your development project:
user@mac: cp -R /Library/Java/JavaVirtualMachines/jdk1.8.0_181.jdk /Users/peter/...


To uninstall the JDK you just have to delete the JDK directory:
user@mac: rm -rf /Library/Java/JavaVirtualMachines/jdk1.8.0_181.jdk



Place the content of the JDK directory in the following directory of your exported RCP product:
YourProduct.app/Contents/Eclipse/jre


Open a shell and verify that the following command works properly:
user@mac: /path/to/YourProduct.app/Contents/Eclipse/jre/Contents/Home/jre/bin/java -version


3. Specify VM in product configuration


Modify the file YourProduct.ini and add the following 2 lines:

-vm
../Eclipse/jre/Contents/Home/jre/bin/java

The linebreak between -vm and the path is important!

Since the product is started from YourProduct.app/Contents/MacOS/YourProduct the path ../Eclipse/jre leads to the JDK we provided.

You can integrate the -vm option in your product build via the product configuration editor in tab Launch => launch arguments => macosx => program arguments. Note that the linebreak is not needed there; it is added by the build process.

4. Launch the product to test if everything works so far


Uninstall the JDK (rm -rf /Library/Java/JavaVirtualMachines/jdk1.8.0_181.jdk) and launch your RCP product. If everything is correct, it will use the JDK you provided. Otherwise it will fail to launch.
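
If you want to double-check which runtime the product is actually running on, the standard system properties java.home and java.version can be logged at startup. The following is only a small sketch; the class name is hypothetical, and where you call it from (for example your application's start method) is up to you.

```java
final class RuntimeCheckSketch {

    // Returns the location and version of the running Java runtime.
    static String describeRuntime() {
        return "java.home=" + System.getProperty("java.home")
                + ", java.version=" + System.getProperty("java.version");
    }

    public static void main(String[] args) {
        System.out.println(describeRuntime());
    }
}
```

With the bundled runtime in use, java.home should point into YourProduct.app/Contents/Eclipse/jre rather than into a system-wide JDK directory.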

5. Remove unneeded files from JDK


You can remove a lot of files and directories from the YourProduct.app/Contents/Eclipse/jre folder. You just need:

YourProduct.app/Contents/Eclipse/jre/Contents/Home/jre
YourProduct.app/Contents/Eclipse/jre/Contents/Info.plist

Remove all other files and directories. Your product will not start anymore, but don't worry: the next step fixes that.

6. Fix Info.plist


The Info.plist contains the option CFBundleExecutable. The setting points to a symbolic link. This link gets broken in the deployment process.

The solution is to simply change CFBundleExecutable:

<key>CFBundleExecutable</key>
<string>../Contents/Home/jre/lib/jli/libjli.dylib</string>


7. Test and deploy


Launch the product to test that everything works as expected. Integrate the deployment of the JRE into your Tycho build.

Saturday, 16 December 2017

Simplify text based editing of DocBook XML documentation

Writing and maintaining technical documentation is a really wide topic. Our product System Concept DMS needs documentation.

Here is what we need:

  1. PDF output with TOC, page-numbers etc.
  2. HTML help page to publish on the web
  3. Files for the Eclipse help system (in-application online help)
All this should be generated out of one single source.


I started with DocBook about 1.5 years ago. As a programmer I have no difficulty editing XML files, but I realized that writing the documentation in plain DocBook XML is too much work. I also expect problems concerning image parameters: they are present in each DocBook file, so changes must be rolled out to all files.

So I left DocBook again and went back to LibreOffice, to at least collect knowledge for later.


Now I have reactivated the DocBook material, and the idea is to simplify the editing process by pre-processing the source files.

Here is my current draft of a simplified input file:

 <section id="function_objekte_entfernen">  
   
 #height 10cm  
   
 <title>Objekte entfernen</title>  
   
   
 Mit der Aktion #b Objekte entfernen# können mit System Concept DMS aufgebrachte   
 #link Haftnotizen function_haftnotiz#, Markierungen und Schwärzungen aus   
 einem Dokument entfernt werden.  
   
   
 Sie können die Funktion per #b Rechtsklick-Objekte entfernen# direkt aus der   
 Kachelansicht oder aus einem geöffneten Dokument aufrufen.  
    
   
 Es wird der Dialog zum Entfernen von Objekten geöffnet. Markieren Sie den entsprechende Eintrag  
 in der Tabelle und Bestätigen Sie mit #b OK#. Das Objekte wird entfernt und die  
 Dokumentansicht aktualisiert.  
   
 #img img/web/function_objekte_entfernen_1.png 12cm 'Dialog zum Entfernen von Objekten'#  
   
 </section>  


As you can see, some DocBook XML elements are still present. Frequently used or complex elements are simplified to a #tag # syntax.

Two empty lines create a paragraph break (</para><para>).

With the #img src width 'title'# tag, the pre-processing centralizes the XML representation of images. If a change is necessary, I just regenerate the XML files.

I decided on '#' because no shift key is needed to type it. Writing the documentation text must be as easy as possible.

#b: emphasis
#img: mediaobject
#icon: inlinemediaobject
#l: itemizedlist
#li: listitem + para
#-: end sequence (e.g. for #l or #li)

The above example will produce the following DocBook XML section:


 <section id="function_objekte_entfernen">  
   
 <?dbfo-need height="10cm" ?>  
 <title>Objekte entfernen</title>  
   
 <para>  
 Mit der Aktion <emphasis>Objekte entfernen</emphasis> können mit System Concept DMS aufgebrachte   
 <link linkend="function_haftnotiz">Haftnotizen</link>, Markierungen und Schwärzungen aus   
 einem Dokument entfernt werden.  
   
 </para>  
 <para>  
 Sie können die Funktion per <emphasis>Rechtsklick-Objekte entfernen</emphasis> direkt aus der   
 Kachelansicht oder aus einem geöffneten Dokument aufrufen.  
    
 </para>  
 <para>
 Es wird der Dialog zum Entfernen von Objekten geöffnet. Markieren Sie den entsprechende Eintrag  
 in der Tabelle und Bestätigen Sie mit <emphasis>OK</emphasis>. Das Objekte wird entfernt und die  
 Dokumentansicht aktualisiert.  
   
 <mediaobject>  
 <imageobject condition="print">  
 <imagedata fileref="img/web/function_objekte_entfernen_1.png" format="PNG" contentdepth="12cm" />  
 </imageobject>  
 <textobject>  
 <phrase>Dialog zum Entfernen von Objekten</phrase>  
 </textobject>  
 <caption>  
 <para>Dialog zum Entfernen von Objekten</para>  
 </caption>  
 </mediaobject>  
   
   
 </para>  
 </section>  
   

This is much further away from "just writing" and almost twice as long.
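
The inline part of such a pre-processing step could be sketched in Java roughly as follows. This is not the actual pre-processor: the class name, the regular expressions and the decision to handle only #b, #link and #height are assumptions for illustration. Paragraph breaks, #img and the list tags are left out.

```java
import java.util.regex.Pattern;

final class DocBookPreprocessorSketch {

    // "#b some text#" -> <emphasis>some text</emphasis>
    private static final Pattern BOLD = Pattern.compile("#b (.+?)#");

    // "#link link text target_id#" -> <link linkend="target_id">link text</link>
    private static final Pattern LINK =
            Pattern.compile("#link (.+?) ([^\\s#]+)#");

    // "#height 10cm" -> <?dbfo-need height="10cm" ?>
    private static final Pattern HEIGHT = Pattern.compile("#height (\\S+)");

    // Replaces the simplified inline tags within a piece of text.
    static String convertInline(String text) {
        String result = LINK.matcher(text)
                .replaceAll("<link linkend=\"$2\">$1</link>");
        result = BOLD.matcher(result)
                .replaceAll("<emphasis>$1</emphasis>");
        return HEIGHT.matcher(result)
                .replaceAll("<?dbfo-need height=\"$1\" ?>");
    }

    public static void main(String[] args) {
        System.out.println(convertInline(
                "Click #b OK# and see #link Notes function_haftnotiz#."));
    }
}
```

Because the patterns do not match across line breaks, a tag has to start and end on the same line in this sketch.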


The next point will be to divide the source into logical files (sections) and re-combine them for different purposes. This can be done via XML entities.


Tuesday, 30 May 2017

Ideas and problems with yellow pin notes on PDF documents

The idea is quite simple: pin those little yellow notes on a digital PDF document in an easy-to-use way. You can do it with Adobe Reader, but there are some problems:

  • It is not easy to use, e.g. you have to use "save as" and cannot overwrite the existing document.
  • Adobe Reader is not available on every system.
  • There is no way to integrate the Reader functions into the System Concept DMS product.

The System Concept DMS software has been able to place notes in an easy-to-use way for about 2 years. But until now there was no way to remove or edit the notes.

The SC DMS feature uses Apache PDFBox and draws a note in 3 steps:

  1. yellow box (addRect + fill)
  2. text (beginText + showText + endText)
  3. border (addRect + stroke)

The user interface provides a two-step wizard to enter the text and choose a position for the note.

PDF document with a nice yellow note pinned


Make it removable

Customers told us it would be great if notes were at least removable. This is not a simple task, since a note consists of a number of drawing operations which are not connected in any way within the PDF.
I found a solution for that: I use PDF comments (lines beginning with '%') to identify content streams which contain removable objects like notes.

So far so good. It turned out that content streams are merged by a certain page function of PDFBox. This resulted in an empty page if the user removed a note.
The reason was that the note META comment was still in the page, but all content had been merged into one single stream.

Use annotations

I tried to rewrite the note feature and make use of PDF annotations. Doing some reverse engineering I found out that Adobe Reader produces annotations.

Apache PDFBox is able to manage annotations, too:


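// Imports used by this fragment (PDFBox 2.x)
import java.io.File;
import java.util.Calendar;
import java.util.List;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.graphics.color.PDColor;
import org.apache.pdfbox.pdmodel.graphics.color.PDDeviceRGB;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotation;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationMarkup;

// doc is an already loaded PDDocument
PDPage page = doc.getPage(0);

List<PDAnnotation> annotations = page.getAnnotations();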
 
PDAnnotationMarkup freeTextMark = new PDAnnotationMarkup();
freeTextMark.setAnnotationName("SCDMS:Note:Peter Pinnau");

freeTextMark.getCOSObject().setName(COSName.SUBTYPE,
   PDAnnotationMarkup.SUB_TYPE_FREETEXT);

freeTextMark.setCreationDate(Calendar.getInstance());
freeTextMark.setAnnotationFlags(4);
   
// Yellow color for background
PDColor yellow = new PDColor(new float[] { 1, 1, 0 }, PDDeviceRGB.INSTANCE);
freeTextMark.setColor(yellow);
  
// Position for the annotation
PDRectangle position = new PDRectangle(); 
   
position.setLowerLeftX(100);
position.setLowerLeftY(200);
position.setUpperRightX(400);
position.setUpperRightY(500);
freeTextMark.setRectangle(position);
   
// Set some data
freeTextMark.setTitlePopup("Peter Pinnau");
freeTextMark.setContents("This is the text\nENTER1\nENTER2");
freeTextMark.setPrinted(true);
freeTextMark.setInvisible(false);
   
// Color black, "Helv" font, 11 point
freeTextMark.getCOSObject().setString(COSName.DA, "0 0 0 rg /Helv 11 Tf");
   
// Add the annotation
annotations.add(freeTextMark);  
  
// Save the document
doc.save(new File("..."));


The above code places a nice multi-line yellow note in the PDF. It is visible and editable in Adobe Reader. It is visible in the PDF viewer shipped with Ubuntu.
But it is NOT visible in Mozilla's PDF.js viewer. Unfortunately SCDMS uses PDF.js to view PDF documents.

I found out that neither Apache PDFBox nor PDF.js generates a so-called default appearance for annotations. Since the annotation has no appearance, it is not visible.

Adobe Reader creates a default appearance and displays the annotation correctly. If the PDF is saved once from Adobe Reader, the annotations also become visible in PDF.js.

There are two open issues concerning that:

PDFJS:
https://github.com/mozilla/pdf.js/issues/6810

PDFBox:
https://issues.apache.org/jira/browse/PDFBOX-2019


The best way to solve this problem for SCDMS would of course be to add a correct appearance stream when generating the annotation.
Unfortunately this goes deep into PDF internals, so I hope that PDFBOX-2019 will be solved in the near future.

For now I switched back to the old implementation and found another way to do the page operations mentioned above, so that the empty-page problem could be solved in this particular case.

The content streams are merged by the following call (page is a page with content from an existing document):

PDDocument.importPage(PDPage page)

I now use:

PDDocument.addPage(PDPage page)

and the content streams are no longer merged.







Friday, 3 March 2017

Noise filter for QR-Code detection in scanned documents

I am using zxing in a project to detect QR codes within scanned documents. The goal is to achieve almost 100% recognition, but there were some issues to solve:


  1. zxing does not find small codes within a document page. Since the QR code stickers are attached to the documents manually, the user has to place the sticker in one of the 4 corners.
    The processor then cuts out corner by corner and searches for the code there.
  2. Unfortunately there were still unrecognized codes. Detection relies on the print quality of the stickers, which may not be good enough in every situation.
    I did some tests and corrected unrecognized codes with GIMP until they worked. I came to the conclusion that a filter is needed to eliminate false pixels as well as possible.



I spent an evening on that and finally found a specialized solution. Take a look at the sample images:

Left: original, Right: filter applied

The result is amazing, isn't it? zxing is now able to recognize the code.

How does it work?


My first idea was to use OpenCV to implement the filter, but I then tried a very simple "self-made" algorithm:


  1. Input has to be black/white pixel data already
  2. Iterate over the pixels (by rows and columns)
  3. Leave white pixels as they are
  4. For each black pixel, count the black pixels in the surrounding 7x7 square
  5. Calculate the ratio: black pixels in the 7x7 square / 49
  6. If the ratio is less than 0.4 -> set the pixel to white

It is important to work on a copy of the input data. The filter must not analyse pixels which have already been modified by the algorithm.

Since QR codes consist of rectangular patterns, the filter destroys almost no real data as long as the stickers are attached reasonably straight.
Typical noise from bad printers or scanning failures is reduced very well.

Close to the borders there is no full 7x7 square available. It would be possible to skip those areas. I decided to shrink the square according to the position and process the data in the same way.

Of course the 7x7 size depends on the QR code size and the scanning resolution.
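
The steps above can be sketched in Java as follows. This is a minimal illustration, not the original implementation: the class name, the boolean[][] representation (true = black) and the parameter names are assumptions. It is also an assumption that the current pixel is not counted and that, at the borders, the ratio is taken against the area of the shrunken square.

```java
final class QrNoiseFilterSketch {

    /**
     * Sets black pixels to white if too few black pixels surround them.
     * black[y][x] == true means the pixel is black.
     * radius 3 gives the 7x7 square, minRatio 0.4 the threshold from the text.
     */
    static boolean[][] clearNoise(boolean[][] black, int radius, double minRatio) {
        int height = black.length;
        int width = height == 0 ? 0 : black[0].length;
        // Write into a new array so that already filtered pixels
        // never influence the analysis of later pixels.
        boolean[][] result = new boolean[height][width];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                if (!black[y][x]) {
                    continue; // white pixels stay white
                }
                // Shrink the square at the image borders.
                int y0 = Math.max(0, y - radius);
                int y1 = Math.min(height - 1, y + radius);
                int x0 = Math.max(0, x - radius);
                int x1 = Math.min(width - 1, x + radius);
                int count = 0;
                for (int yy = y0; yy <= y1; yy++) {
                    for (int xx = x0; xx <= x1; xx++) {
                        if (black[yy][xx] && !(yy == y && xx == x)) {
                            count++;
                        }
                    }
                }
                int area = (y1 - y0 + 1) * (x1 - x0 + 1);
                // Keep the pixel black only if the ratio reaches the threshold.
                result[y][x] = (double) count / area >= minRatio;
            }
        }
        return result;
    }
}
```

A call like clearNoise(pixels, 3, 0.4) corresponds to the 7x7 square and the 0.4 threshold described above.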

The following illustration shows the 7x7 square around the current pixel. The black pixel count in that case equals 5 (current pixel not included). The current pixel will be erased and set to white.

Illustration 7x7 square


Make it simpler


A friend of mine pointed out that calculating the ratio is not necessary. The size of the square is always 7x7 = 49 pixels, so the threshold of 0.4 can be precalculated as 0.4 * 49 = 19.6, i.e. about 20 pixels.

Exception: the border areas of the image. The square is shrunk there, but it is no problem to use the precalculated threshold. The algorithm is then a little more "aggressive" at the image borders (first 3 pixels).


Close areas


The next step is to use the algorithm to close areas: if the black pixel count is greater than 40, the pixel is set to black.

The following image shows the progress. Please enlarge the picture and compare the middle and right samples. You will see that some white pixels in the data blocks have been closed.

Left: original, middle: cleared, right: closed
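
A possible Java sketch of this closing step, using a precalculated pixel count instead of a ratio. The class and method names and the boolean[][] representation (true = black) are assumptions, as is the choice not to count the current pixel itself.

```java
final class QrCloseAreasSketch {

    static final int RADIUS = 3;           // 3 pixels to each side = 7x7 square
    static final int CLOSE_THRESHOLD = 40; // "greater than 40" from the text

    // Counts the black pixels around (x, y), excluding the pixel itself.
    // The square is shrunk at the image borders.
    static int countBlack(boolean[][] black, int x, int y) {
        int count = 0;
        int yEnd = Math.min(black.length - 1, y + RADIUS);
        for (int yy = Math.max(0, y - RADIUS); yy <= yEnd; yy++) {
            int xEnd = Math.min(black[yy].length - 1, x + RADIUS);
            for (int xx = Math.max(0, x - RADIUS); xx <= xEnd; xx++) {
                if (black[yy][xx] && !(xx == x && yy == y)) {
                    count++;
                }
            }
        }
        return count;
    }

    // Sets white pixels to black if more than CLOSE_THRESHOLD
    // black pixels surround them. The input array is not modified.
    static boolean[][] closeAreas(boolean[][] black) {
        boolean[][] result = new boolean[black.length][];
        for (int y = 0; y < black.length; y++) {
            result[y] = black[y].clone();
            for (int x = 0; x < black[y].length; x++) {
                if (!black[y][x] && countBlack(black, x, y) > CLOSE_THRESHOLD) {
                    result[y][x] = true;
                }
            }
        }
        return result;
    }
}
```

With a fixed threshold, a white pixel directly on the image edge has at most 27 neighbours in its shrunken square (4 x 7 minus the pixel itself), so it can never exceed 40 and is never closed in this sketch.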