Linking Pentaho Business Intelligence Server 5.0 Community Edition to Active Directory

So, you’ve installed Pentaho BI Server 5 CE, followed their documentation for the CE and even the Enterprise Edition, and looked high and low through blog and forum posts for solutions, but it’s simply not working.

Some errors in the log will throw you off course, like “validation.properties could not be loaded by any means.” from the ESAPI authentication system. That one is actually harmless, and probably even to be expected.

Adding extra logging, as documented on the Pentaho CE Wiki, will help you debug, but this is definitely not as easy as it should be.

I understand the Enterprise version has a nice Web GUI (like Atlassian’s products, and like Pentaho’s mail server configuration) which makes this a lot easier for you. Alas, for the people still testing this BI solution it’d be nice if this worked, too.

Luckily I have some knowledge of Java and its frameworks, and I was able to find a lot more information by just looking at the documentation of the frameworks Pentaho uses, such as Spring, and more specifically Spring’s LDAP documentation. Spring seems to have a dedicated AD authenticator these days, but it’s not included with Pentaho.

So, after a few hours of cursing, here’s the way to do this. It was tested against a Samba4 AD domain, which should be 100% compatible with a Windows 2008 one.

Step one: Roll back everything you broke by following their wiki documentation. It’s all wrong, and will make you break all sorts of stuff which is not needed at all.

Step two: Fill out the appropriate properties files as detailed below.

Set the LDAP connection data. A user to bind to the directory is needed to be able to look up user DNs. I’ve created a group “Pentaho Administrators”; members of this group have administrative rights inside the BI server system. Note that their configuration panel calls this group “Administrator” no matter its name. Somewhat confusing.

The enterprise documentation mentions a special but well-known OID you can use to do recursive group searches, but it doesn’t work against Samba4 (at least the version I’m running) so I have not configured this.
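The OID in question is presumably 1.2.840.113556.1.4.1941, Active Directory’s LDAP_MATCHING_RULE_IN_CHAIN matching rule. As a minimal sketch, this is roughly what a recursive group filter built around it looks like. The function names are hypothetical and only build the filter string; nothing here talks to a directory.

```python
# OID of Active Directory's LDAP_MATCHING_RULE_IN_CHAIN extensible match rule.
MATCHING_RULE_IN_CHAIN = "1.2.840.113556.1.4.1941"


def escape_filter_value(value):
    """Escape the characters that are special inside an LDAP search filter (RFC 4515)."""
    replacements = {"\\": r"\5c", "*": r"\2a", "(": r"\28", ")": r"\29", "\0": r"\00"}
    return "".join(replacements.get(ch, ch) for ch in value)


def nested_group_filter(user_dn):
    """Build a filter matching every group that contains user_dn, directly or through nesting."""
    return "(member:{}:={})".format(MATCHING_RULE_IN_CHAIN, escape_filter_value(user_dn))


if __name__ == "__main__":
    # Hypothetical user DN, in the same placeholder domain used in this post.
    print(nested_group_filter("cn=jdoe,cn=Users,dc=your,dc=domain,dc=root"))
```

In the properties file this would end up as populator.groupSearchFilter=(member\:1.2.840.113556.1.4.1941\:\={0}), with {0} standing in for the user’s DN.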

biserver-ce/pentaho-solutions/system/applicationContext-security-ldap.properties

contextSource.providerUrl=ldaps\://yourdc.example.com\:636
contextSource.userDn=cn\=binduser,cn\=Users,dc\=your,dc\=domain,dc\=root
contextSource.password=bindpassword

userSearch.searchBase=cn\=Users,dc\=your,dc\=domain,dc\=root
userSearch.searchFilter=(sAMAccountName\={0})

populator.convertToUpperCase=false
populator.groupRoleAttribute=cn
populator.groupSearchBase=ou\=Groups,dc\=your,dc\=domain,dc\=root
populator.groupSearchFilter=(member\={0})
populator.rolePrefix=
populator.searchSubtree=true

allAuthoritiesSearch.roleAttribute=cn
allAuthoritiesSearch.searchBase=ou\=Groups,dc\=your,dc\=domain,dc\=root
allAuthoritiesSearch.searchFilter=(objectClass\=group)

allUsernamesSearch.usernameAttribute=sAMAccountName
allUsernamesSearch.searchBase=cn\=Users,dc\=your,dc\=domain,dc\=root
allUsernamesSearch.searchFilter=(objectClass\=Person)

adminRole=cn\=Pentaho Administrators,ou\=Groups,dc\=your,dc\=domain,dc\=root
adminUser=cn\=Administrator,cn\=Users,dc\=your,dc\=domain,dc\=root
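The backslashes in this file are only Java .properties escapes; the values Spring ends up with contain plain “=” and “:”. Inside a value the escape is optional, so an unescaped “=” there is read the same way. The sketch below is a deliberately tiny Python stand-in for the Java parser (no line continuations, no \t, \n or \uXXXX escapes) that shows what a few sample lines decode to. The function names and the sample login name are made up for illustration.

```python
import re


def unescape(text):
    # Drop the backslash from simple escapes such as "\=" and "\:".
    return re.sub(r"\\(.)", r"\1", text)


def parse_properties(text):
    """Parse key=value lines the way a Java .properties file is read (small subset only)."""
    props = {}
    for raw in text.splitlines():
        line = raw.lstrip()
        if not line or line[0] in "#!":
            continue  # blank line or comment
        # The separator is the first '=' or ':' that is not escaped with a backslash.
        match = re.search(r"(?<!\\)[=:]", line)
        if match is None:
            props[unescape(line.strip())] = ""
            continue
        key = line[:match.start()].strip()
        value = line[match.end():].lstrip()
        props[unescape(key)] = unescape(value)
    return props


# The adminRole sample deliberately mixes "\=" and a bare "=" to show they decode identically.
SAMPLE = r"""
contextSource.providerUrl=ldaps\://yourdc.example.com\:636
userSearch.searchFilter=(sAMAccountName\={0})
adminRole=cn\=Pentaho Administrators,ou=Groups,dc\=your,dc\=domain,dc\=root
"""

if __name__ == "__main__":
    props = parse_properties(SAMPLE)
    for key, value in props.items():
        print(key, "->", value)
    # At login time {0} is replaced by the name the user typed; str.replace stands in for that.
    print(props["userSearch.searchFilter"].replace("{0}", "jdoe"))
```

If a value looks wrong after decoding (a stray backslash, a missing comma between DN components), that is usually quicker to spot here than in the server logs.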

We can easily switch over to LDAP authentication:

biserver-ce/pentaho-solutions/system/security.properties

provider=ldap

The only thing that is somewhat odd is that it doesn’t see any groups further down my ou=Groups tree, even though searchSubtree is set to true.

Finally, to avoid heaps of (nested) “User admin not found in directory.” exceptions in the logs, change the “admin” username that Pentaho configures for the Spring framework to a user that exists in your directory. Somehow Pentaho requires this user to do … I don’t know. Perhaps only to nag about it. I’ve changed it to “Administrator” as I don’t have a user called “admin”. You could create one, of course, but what for… Things seemed to work without this change, but the logs were bombarded with error messages.

biserver-ce/pentaho-solutions/system/repository.spring.properties

singleTenantAdminDefaultUserName=Administrator
singleTenantAdminUserName=Administrator
singleTenantAdminDefaultAuthorityName=Administrator
singleTenantAdminAuthorityName=Administrator
repositoryAdminUsername=pentahoRepoAdmin
singleTenantAuthenticatedAuthorityName=Authenticated
singleTenantAnonymousAuthorityName=Anonymous
superAdminAuthorityName=SysAdmin
superAdminUserName=super
systemTenantAdminUserName=system
systemTenantAdminPassword=cGFzc3dvcmQ=
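A side note on the last line: the systemTenantAdminPassword value looks scrambled, but it is plain Base64, not encryption. A quick sketch with Python’s standard library shows what it decodes to. The encode_password helper is hypothetical, and that Pentaho accepts any Base64 string in this property is an assumption, not something tested here.

```python
import base64

# Value copied from repository.spring.properties above.
encoded = "cGFzc3dvcmQ="
decoded = base64.b64decode(encoded).decode("utf-8")


def encode_password(plain):
    """Base64-encode a plain-text password in the same style as the shipped value."""
    return base64.b64encode(plain.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    print(decoded)  # prints: password
    print(encode_password("password"))  # prints: cGFzc3dvcmQ=
```

In other words, anyone who can read this file can read that password.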

I hope this helps you, if you’re stuck on this. LDAP/AD shouldn’t just be for the paid edition – in Observium we fully support this in the CE as well 😉

It took a few hours to debug all this. It’s a shame that their documentation sends you in the worst direction possible, because in the end getting it to work is very easy: you just need to fill out a few properties with the correct data.

Unfortunately I didn’t really keep track of all the weird errors I got when trying to get this to work. I would have added them as extra Google bait if I had.

Update: it may seem unimportant, but it certainly is not: make sure Administrator (or whatever user you chose above) is a member of the Pentaho Administrators group you configured! Otherwise the application seems to work OK at first sight, but storing/clearing recent files, creating data sources, etc. will not work correctly and will produce the following (or a similar) error in your logs:

ERROR [BackingRepositoryLifecycleManagerAuthenticationSuccessListener] Access denied to this data; nested exception is javax.jcr.AccessDeniedException: Access denied.
org.springframework.security.AccessDeniedException: Access denied to this data; nested exception is javax.jcr.AccessDeniedException: Access denied.
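Since the log messages are the main clue you get, here is a small sketch that scans log lines for the three messages quoted in this post and prints the matching hint. The KNOWN_ERRORS table and diagnose function are made up for illustration, and the log path in the usage comment is an assumption about a default CE install.

```python
# Substrings of the log messages quoted in this post, mapped to what to do about them.
KNOWN_ERRORS = {
    "validation.properties could not be loaded by any means":
        "harmless ESAPI message, ignore it",
    "User admin not found in directory":
        "point the singleTenantAdmin* names in repository.spring.properties at an existing user",
    "javax.jcr.AccessDeniedException: Access denied":
        "make sure the admin user is a member of the configured admin group",
}


def diagnose(lines):
    """Return the hints for every known message found in lines, each hint at most once."""
    hints = []
    for line in lines:
        for needle, hint in KNOWN_ERRORS.items():
            if needle in line and hint not in hints:
                hints.append(hint)
    return hints


if __name__ == "__main__":
    # Assumed log location; adjust to wherever your install writes pentaho.log.
    with open("biserver-ce/tomcat/logs/pentaho.log", encoding="utf-8", errors="replace") as log:
        for hint in diagnose(log):
            print(hint)
```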
