The D2 clients were, until recently (D2 version 16.4 included), working by default with the java plugin mode and therefore, many actions were using a JVM on the client workstation. This setting can be controlled/set in the “WEB-INF/classes/settings.properties” file of D2 applications (D2, D2-REST, D2-Smartview) and the default value was therefore “browser.plugin.mode = java“. Starting with D2 16.5, this was changed and the default value is now “browser.plugin.mode = wsctf,thin“, meaning that it will try to use/load the wsctf (Web Socket Content Transfer Framework) and if it’s not possible it will fallback to the thin mode. This blog isn’t an introduction to the browser plugin mode, there would be a LOT to talk about but I needed to introduce a few things so the issue below will be clear/understandable by everyone.
So in a newly build environment on a K8s infrastructure, a user reported some random failures on D2 Export/Download. This specific environment was built in HA in Kubernetes so we had 3 Content Servers pods, 4 D2 pods (WebLogic Managed Server), 2 D2-Config pods (WebLogic Managed Server), and some other components. The session stickiness was handled by the Ingress Controller using Cookies and this was working properly, the HA was also working properly. So, login/logout/browse/search was fully working for all clients but when you started to Export/Download a document, then some randomness appeared. The environment was using D2 16.5.1, it’s important because of the thing I mentioned above: it was therefore using the wsctf plugin (we didn’t change it). At first, I tried to replicate the issue using Google Chrome, log in to D2, find a document, right-click on it and select Export. From what I could see on the D2 logs, it was always working: I tried 20 times in a 10/15 minutes period but each and every time, I could see the download request on the same WebLogic Managed Server that I was logged in. Therefore, on the 4 D2 pods I had, only 1 was used. Removing the cookies, restarting the Google Chrome browser and the same happened for another pod (no issue). So, what’s the issue then? Well actually, I wasn’t using the wsctf plugin on Chrome because it’s usually faster to use the thin mode and therefore that’s how I tried to replicate the issue but couldn’t. The logs of the download using the thin mode were like that:
2020-03-28 11:20:39,226 UTC [INFO ] ([ACTIVE] ExecuteThread: '39' for queue: 'weblogic.kernel.Default (self-tuning)') - com.emc.d2fs.dctm.servlets.D2HttpServlet : Download - START ===================================== 2020-03-28 11:20:39,227 UTC [INFO ] ([ACTIVE] ExecuteThread: '39' for queue: 'weblogic.kernel.Default (self-tuning)') - com.emc.d2fs.dctm.servlets.D2HttpServlet : Download - HTTP Headers Remote : 10.10.10.10 (10-10-10-10.ingress-nginx.ingress-nginx.svc.cluster.local) Locale : en_US Request Protocol : HTTP/1.1 Request Method : GET Context Path : /D2 Request URI : /D2/servlet/Download Request encoding : null Request Parameters : _docbase : Repo1 _locale : en _password : ******** _timeZoneId : Europe/Zurich _username : morgan_patou event_name : d2_export format : hasRelate : 1 id : 090f1234809aa823 uid : Repo1-1585394439230-morgan_patou-401674627 Request Headers : Host : d2.domain.com X-Request-ID : 50a051639dc2a3971cfc3d6cb9010738 X-Real-IP : 10.10.0.1 X-Forwarded-For : 10.10.0.1 X-Forwarded-Host : d2.domain.com X-Forwarded-Port : 443 X-Forwarded-Proto : https X-Original-URI : /D2/servlet/Download?uid=Repo1-1585394439230-morgan_patou-401674627&_docbase=Repo1&_username=morgan_patou&_password=DM_TICKET%3DEwxMwp2gMAoZlOVC...BMOE1HWXlJXCg%3D%3D&_locale=en&_timeZoneId=Europe%2FZurich&id=090f1234809aa823&format=&event_name=d2_export&_timeZoneId=Europe%2FZurich&hasRelate=1 X-Scheme : https upgrade-insecure-requests : 1 user-agent : Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36 sec-fetch-dest : iframe accept : text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9 sec-fetch-site : same-origin sec-fetch-mode : navigate referer : https://d2.domain.com/D2/?docbase=Repo1&locateId=090f1234809aa823 accept-encoding : gzip, deflate, br accept-language : en-US,en;q=0.9 Cookies : lastRepo : Repo1 locale : en_US sessAcv : 1 https%3A%2F%2Fd2.domain.com%2FD2%2Fx3_portal%2Ftheme : %7B%22state%22%3A%7B%22id%22%3A%22s%3Aslate%22%2C%20%22file%22%3A%22s%3Aresources%2Fthemes%2Fslate%2Fcss%2Fxtheme-x3.css%22%7D%7D D2_k8s_env1_sticky : 1585393439.816.1698.878338 JSESSIONID_D2_K8s_env1 : H04MOTyK_nAo4LXdEZlMoeCTMrq94rsx53ds2_3kuhKEG8240V9S!-1569294555 x-auto-271 : %7B%22state%22%3A%7B%22rowSel%22%3A%22i%3A7%22%7D%7D sessExTm : 1585395180150 Http Session : XSRF_TOKEN : 03c0e04906106d97cbe884a7d316deb168b1850959b_1572b22bbb360f265 CONTEXT : {Repo1-1585394439230-morgan_patou-401674627=com.emc.d2fs.models.context.Context@4d3e927a, Repo1-1585393445738-morgan_patou-1618912563=com.emc.d2fs.models.context.Context@589aaba3} org.apache.shiro.web.session.HttpServletSession.HOST_SESSION_KEY : 10-10-10-10.ingress-nginx.ingress-nginx.svc.cluster.local 2020-03-28 11:20:39,228 UTC [INFO ] ([ACTIVE] ExecuteThread: '39' for queue: 'weblogic.kernel.Default (self-tuning)') - com.emc.d2fs.dctm.servlets.D2HttpServlet : Download - Standard Servlet : 2020-03-28 11:20:40,642 UTC [INFO ] ([ACTIVE] ExecuteThread: '39' for queue: 'weblogic.kernel.Default (self-tuning)') - c.e.d.a.c.documentset.D2DocumentSetSwitch : D2DocumentSetSwitch.getDocumentSetList end: 0.000s 2020-03-28 11:20:41,682 UTC [INFO ] ([ACTIVE] ExecuteThread: '39' for queue: 'weblogic.kernel.Default (self-tuning)') - com.emc.d2fs.dctm.servlets.download.Download : downloadFile:Test_document.docx 2020-03-28 11:20:41,682 UTC [INFO ] ([ACTIVE] ExecuteThread: '39' for queue: 'weblogic.kernel.Default (self-tuning)') - com.emc.d2fs.dctm.servlets.download.Download : file stream is not null 2020-03-28 11:20:41,687 UTC [INFO ] ([ACTIVE] ExecuteThread: '39' for queue: 'weblogic.kernel.Default (self-tuning)') - com.emc.d2fs.dctm.servlets.D2HttpServlet : Download - Free memory=242.65753 MB. Total memory=1.0 GB. 2020-03-28 11:20:41,687 UTC [INFO ] ([ACTIVE] ExecuteThread: '39' for queue: 'weblogic.kernel.Default (self-tuning)') - com.emc.d2fs.dctm.servlets.D2HttpServlet : Download - END (2.462s) =====================================
With the above, everything is fine and all communications go through the same D2 pod thanks to the stickiness. So what’s the issue then? To be sure, I tried the same thing with Internet Explorer: login/logout/browse/search was also fully working, all targeting a single pod in the backend. However, when trying to Export/Download, I could see on the D2 logs that the download request was actually triggered to any of the D2 pod, meaning the stickiness was lost for this specific type of request. While looking at the download logs, I could see the following:
2020-03-28 11:23:16,734 UTC [INFO ] ([ACTIVE] ExecuteThread: '49' for queue: 'weblogic.kernel.Default (self-tuning)') - com.emc.d2fs.dctm.servlets.D2HttpServlet : Download - START ===================================== 2020-03-28 11:23:16,735 UTC [INFO ] ([ACTIVE] ExecuteThread: '49' for queue: 'weblogic.kernel.Default (self-tuning)') - com.emc.d2fs.dctm.servlets.D2HttpServlet : Download - HTTP Headers Remote : 10.10.10.10 (10-10-10-10.ingress-nginx.ingress-nginx.svc.cluster.local) Locale : en_US Request Protocol : HTTP/1.1 Request Method : GET Context Path : /D2 Request URI : /D2/servlet/Download Request encoding : null Request Parameters : _docbase : Repo1 _locale : en _password : ******** _timeZoneId : Europe/Zurich _username : morgan_patou event_name : d2_export format : hasRelate : 1 id : 090f1234809aa823 uid : Repo1-1585394596738-morgan_patou-1829106889 Request Headers : Host : d2.domain.com X-Request-ID : 267f10747ab4e6df0b0b74c6be0e2f95 X-Real-IP : 10.10.0.1 X-Forwarded-For : 10.10.0.1 X-Forwarded-Host : d2.domain.com X-Forwarded-Port : 443 X-Forwarded-Proto : https X-Original-URI : /D2/servlet/Download?uid=Repo1-1585394596738-morgan_patou-1829106889&_docbase=Repo1&_username=morgan_patou&_password=DM_TICKET%3DEwxMwp2gMAoZlOVC...2UFK01ODR6dCg%3D%3D&_locale=en&_timeZoneId=Europe%2FZurich&id=090f1234809aa823&format=&event_name=d2_export&_timeZoneId=Europe%2FZurich&hasRelate=1 X-Scheme : https User-Agent : python-requests/2.21.0 Accept-Encoding : gzip, deflate Accept : */* Http Session : 2020-03-28 11:23:16,735 UTC [INFO ] ([ACTIVE] ExecuteThread: '49' for queue: 'weblogic.kernel.Default (self-tuning)') - com.emc.d2fs.dctm.servlets.D2HttpServlet : Download - Standard Servlet : 2020-03-28 11:23:18,044 UTC [ERROR] ([ACTIVE] ExecuteThread: '49' for queue: 'weblogic.kernel.Default (self-tuning)') - com.emc.d2fs.dctm.web.services.D2fsContext : {} com.documentum.fc.client.DfServiceException: [DM_SESSION_E_LDAP_CHGPASS_USER]error: "The changepassword API failed with the following error: morgan_patou" at com.documentum.fc.client.impl.docbase.DocbaseExceptionMapper.newException(DocbaseExceptionMapper.java:57) at com.documentum.fc.client.impl.connection.docbase.MessageEntry.getException(MessageEntry.java:39) at com.documentum.fc.client.impl.connection.docbase.DocbaseMessageManager.getException(DocbaseMessageManager.java:137) at com.documentum.fc.client.impl.connection.docbase.netwise.NetwiseDocbaseRpcClient.checkForMessages(NetwiseDocbaseRpcClient.java:329) at com.documentum.fc.client.impl.connection.docbase.netwise.NetwiseDocbaseRpcClient.applyForObject(NetwiseDocbaseRpcClient.java:672) at com.documentum.fc.client.impl.connection.docbase.DocbaseConnection$8.evaluate(DocbaseConnection.java:1382) ... at weblogic.work.ExecuteThread.run(ExecuteThread.java:360) 2020-03-28 11:23:18,046 UTC [ERROR] ([ACTIVE] ExecuteThread: '49' for queue: 'weblogic.kernel.Default (self-tuning)') - c.emc.d2fs.dctm.aspects.InjectSessionAspect : {} com.documentum.fc.client.DfServiceException: [DM_SESSION_E_LDAP_CHGPASS_USER]error: "The changepassword API failed with the following error: morgan_patou" at com.documentum.fc.client.impl.docbase.DocbaseExceptionMapper.newException(DocbaseExceptionMapper.java:57) ... at weblogic.work.ExecuteThread.run(ExecuteThread.java:360) 2020-03-28 11:23:18,046 UTC [WARN ] ([ACTIVE] ExecuteThread: '49' for queue: 'weblogic.kernel.Default (self-tuning)') - com.emc.d2fs.dctm.servlets.download.Download : error in downloading file:com.emc.d2fs.exceptions.D2fsException: [DM_SESSION_E_LDAP_CHGPASS_USER]error: "The changepassword API failed with the following error: morgan_patou" 2020-03-28 11:23:18,056 UTC [ERROR] ([ACTIVE] ExecuteThread: '49' for queue: 'weblogic.kernel.Default (self-tuning)') - com.emc.d2fs.dctm.servlets.D2HttpServlet : Download failed com.emc.d2fs.exceptions.D2fsException: [DM_SESSION_E_LDAP_CHGPASS_USER]error: "The changepassword API failed with the following error: morgan_patou" at com.emc.d2fs.dctm.aspects.InjectSessionAspect.exceptionHandling(InjectSessionAspect.java:347) at com.emc.d2fs.dctm.aspects.InjectSessionAspect.ajc$inlineAccessMethod$com_emc_d2fs_dctm_aspects_InjectSessionAspect$com_emc_d2fs_dctm_aspects_InjectSessionAspect$exceptionHandling(InjectSessionAspect.java:1) at com.emc.d2fs.dctm.aspects.InjectSessionAspect.process(InjectSessionAspect.java:245) at com.emc.d2fs.dctm.web.services.download.D2DownloadService.getDownloadObjectId(D2DownloadService.java:703) at com.emc.d2fs.dctm.servlets.download.Download.processRequest(Download.java:166) at com.emc.d2fs.dctm.servlets.D2HttpServlet.execute(D2HttpServlet.java:244) ... 2020-03-28 11:23:18,058 UTC [INFO ] ([ACTIVE] ExecuteThread: '49' for queue: 'weblogic.kernel.Default (self-tuning)') - com.emc.d2fs.dctm.servlets.D2HttpServlet : Download - Free memory=560.1069 MB. Total memory=1.0 GB. 2020-03-28 11:23:18,058 UTC [INFO ] ([ACTIVE] ExecuteThread: '49' for queue: 'weblogic.kernel.Default (self-tuning)') - com.emc.d2fs.dctm.servlets.D2HttpServlet : Download - END (1.324s) =====================================
That’s pretty different compared to the content of the download from Google Chrome, isn’t it? If you look carefully, you can see that there are Cookies in particular aren’t in the request made… One might think that’s it’s just IE that doesn’t provide it while Chrome does but actually, that’s not it. D2 is working in the same way (on this aspect at least…) for both Chrome and IE. In this case, the content of the log was different because for IE, the wsctf plugin was used. Therefore, the issue in this case is caused by the fact that the wsctf plugin doesn’t send the cookies in the download request while the thin mode does. That’s a bug, it shouldn’t happen. Initially, I wanted to open a SR with the OpenText Support to ask whether this is the expected behavior. However, a few minutes later, I found another related issue (see this blog) which clearly showed that it’s not.
Since there is no cookies, the Ingress Controller doesn’t know which D2 pod should receive the request and therefore it’s a random distribution. Obviously, this isn’t just for an Ingress Controller, you would face exactly the same thing with a standard Load Balancer that is configured to use cookies stickiness.
Is this really the issue? Well, not mandatorily… I mean yes it can be considered as the issue as the behavior is different between the wsctf and the thin mode, which in itself shouldn’t happen. However, the download request normally doesn’t need the session details because it’s a “standalone” request which could therefore normally be processed by any of the D2 pod running. There is one parameter that is often forgotten when setting up D2 and that is specific to the HA setup: the “loadBalancedContentServer” parameter inside the “WEB-INF/classes/D2FS.properties” file. This parameter is by default set to false but if you have more than one Content Server (so HA on the Repo layer), then you are supposed to set it to true. It’s often forgotten because keeping it to false will usually not cause much trouble, except under very specific circumstances. This parameter wasn’t set properly in this environment, so I just fixed it to solve this specific issue:
[weblogic@wsd2-0 ~]$ cd $APPLICATIONS [weblogic@wsd2-0 apps]$ [weblogic@wsd2-0 apps]$ d2fs="D2/WEB-INF/classes/D2FS.properties" [weblogic@wsd2-0 apps]$ grep loadBalancedContentServer ${d2fs} loadBalancedContentServer=false [weblogic@wsd2-0 apps]$ [weblogic@wsd2-0 apps]$ sed -i 's,^loadBalancedContentServer=.*,loadBalancedContentServer=true,' ${d2fs} [weblogic@wsd2-0 apps]$ [weblogic@wsd2-0 apps]$ grep loadBalancedContentServer ${d2fs} loadBalancedContentServer=true [weblogic@wsd2-0 apps]$
Once it’s corrected, redeploy D2 and then you should still see the same behavior in terms of wsctf download requests, meaning that the requests will be sent to any of the D2 pods with cookies. However, thanks to this parameter, the download requests should all succeed, even if the D2 pod handling it isn’t the one you are currently logged into and that’s because the login ticket is now global so it will be accepted by all Content Servers, no matter which one is used.