
Date: 11/07/2007
From: GSL@syscomworld.com
Subject: RE: SV: SV: [ServletExec] Character encoding issue with POST parameters
You are absolutely correct. It is not directly related to HTTP POST though, but 
to the content-type used by default when posting,                               
application/x-www-form-urlencoded. Seems kind of obvious now... :)

I've searched for this, but cannot find a single mention, outside of the HTTP
spec, of being required to URL-encode POST parameters. I find that a bit odd:
given the AJAX revolution we're in, I would expect this to be a more common
mistake. This leads me to believe that you might be one of only a few with such
strict behavior, and that this issue deserves its very own FAQ entry.

Thank you so much for your help.

Regards,
Glenn

________________________________

From: servletexec-interest-owner@newatlanta.com on behalf of                    
mmcginty@newatlanta.com
Sent: Tue 06.11.2007 19:50
To: servletexec-interest@newatlanta.com
Subject: Re: SV: SV: [ServletExec] Character encoding issue with POST           
parameters


Hmmm...

I typically use the Firefox plugin named "Live HTTP Headers".

I just used it and it showed me that the HTML form sends these POST args:
ta=%C3%A6%C3%B8%C3%A5%C3%86%C3%98%C3%85

(which is 39 characters long).

It also showed me that your JavaScript sends these POST args:
ta=æøå?~...

So it seems to me that your JavaScript client gets the Content-Length value     
wrong because it is not URL-encoding the POST args prior to sending them to the 
servlet.

Have your JavaScript client URL-encode the POST args (or pass args that have
already been URL-encoded to your JavaScript code).
I suppose the JavaScript will then send the correct data along with the correct
Content-Length.
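
For reference, here is a minimal Java sketch of the encoding step described above. It is not code from the thread: the class and method names (FormBodyEncoder, encodePair) are made up for illustration, and the JDK's java.net.URLEncoder stands in for whatever the client would use. It assumes the six characters in the text area are æøåÆØÅ, which is what the bytes quoted in the thread correspond to.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class FormBodyEncoder {

    /** Builds "name=value" with both parts percent-encoded as UTF-8. */
    public static String encodePair(String name, String value) {
        try {
            return URLEncoder.encode(name, "UTF-8") + "=" + URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java runtime is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The six characters from the test page, written as Unicode escapes
        // so the source file encoding does not matter.
        String value = "\u00E6\u00F8\u00E5\u00C6\u00D8\u00C5";
        String body = encodePair("ta", value);
        System.out.println(body);          // ta=%C3%A6%C3%B8%C3%A5%C3%86%C3%98%C3%85
        System.out.println(body.length()); // 39
    }
}
```

In a browser script, the built-in encodeURIComponent() function performs the corresponding percent-encoding of a parameter value.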

Tomcat may be more lenient in this area, but my understanding of the HTTP Spec  
is that one *must* URL-encode the very sort of characters that you are trying   
to POST. A quick google search took me here:

http://www.blooberry.com/indexdot/html/topics/urlencoding.htm

I also found this page helpful whilst delving into your issue:
http://www1.tip.nl/~t876506/utf8tbl.html

I hope that helps.

Matt McGinty
Software Engineer
New Atlanta Communications, LLC
http://www.newatlanta.com <http://www.newatlanta.com/>  


Glenn Slotte wrote: 

You're right about the letter 'a'. That's a line break I
removed to simplify the output.
 
You have made some very good reasoning, but you seem to assume that the content 
is equal in both cases. From what I can tell, 15 is the correct length. There   
are three single-byte characters ("ta=") and six double-byte          
characters, adding up to 3+(6*2) = 3+12 = 15 bytes. It looks like submitting a  
form might add some more content.
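
The byte arithmetic above can be checked with a short Java sketch. This is only an illustration: the class name RawBodyLength and the helper utf8Length are hypothetical, and the text-area content is assumed to be æøåÆØÅ. It counts the UTF-8 bytes of the raw body and, for comparison, of the percent-encoded body that appears elsewhere in this thread.

```java
import java.nio.charset.StandardCharsets;

public class RawBodyLength {

    /** Number of bytes the string occupies when serialized as UTF-8. */
    public static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // "ta=" followed by the six non-ASCII characters, sent as-is.
        String raw = "ta=\u00E6\u00F8\u00E5\u00C6\u00D8\u00C5";
        // The same parameter after percent-encoding (pure ASCII).
        String encoded = "ta=%C3%A6%C3%B8%C3%A5%C3%86%C3%98%C3%85";

        System.out.println(utf8Length(raw));     // 15 = 3 + 6*2
        System.out.println(utf8Length(encoded)); // 39 = 3 + 6*6
    }
}
```

Both numbers are correct byte counts; they simply describe two different request bodies.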
 
I'm not at work right now, so I can't test this myself. If you        
don't have the opportunity to look at this today, I'll look into it   
tomorrow.
 
By the way, there is a firefox extension called Firebug, that allows you to     
easily monitor HTTP requests.
 
Glenn 

________________________________

From: servletexec-interest-owner@newatlanta.com on behalf of
mmcginty@newatlanta.com
Sent: Tue 06.11.2007 17:15
To: servletexec-interest@newatlanta.com
Subject: Re: SV: [ServletExec] Character encoding issue with POST parameters


Glenn,

Short answer:
 Your JavaScript client is not sending the correct value for the Content-Length 
request Header.

Long answer:

I took the pieces of your test and put them into a web application so that I    
could try running your JSP inside SE.
Since your POST request is sent to a servlet that you made (which you named     
"com.syscom.test.TestServlet"), I first had to compile that servlet.
	
I requested your test.jsp and saw the text area in the response in my browser.
	That text area was pre-populated with 6 characters.
I clicked the "send" button and received a response from SE saying    
that the webapp could not find a file or servlet for the requested URL.

So I looked at the source of your test.jsp and noticed that you are not using   
an HTML Form.
When one clicks your "send" button, a JavaScript method is fired.
That JavaScript method extracts the contents of the textarea (the 6 characters) 
and passes them to another JavaScript function:
 YAHOO.util.Connect.asyncRequest()

which makes the POST request to merely "TestServlet".

So I configured that servlet inside the webapp's web.xml file and then     
mapped an exact alias of /TestServlet to that configured servlet.
Then when I re-posted the 6 characters... I saw the following response in my    
browser:

  ffffffe6 fffffff8 ffffffe5 3f 3f 3f 3f 3f 3f | ffffffc6 ffffffd8 ffffffc5

So... once I got the pieces in place and configured properly I was able to see  
the behavior you describe.

Now...

Since the actual client to the servlet (in your test) is JavaScript (not the    
browser), I first wanted to simplify the problem by taking the JavaScript       
completely out of the picture.
I did this by adding a simple HTML form at the bottom of your test.jsp page.
It too uses a pre-populated text area and has a send/submit button.
However when one clicks that form's submit button, no JavaScript is used   
at all.

Instead that form simply POSTs to the same servlet (your TestServlet).
The response I get then seems much better:

   ffffffe6 fffffff8 ffffffe5 ffffffc6 ffffffd8 ffffffc5 | ffffffc6 ffffffd8    
ffffffc5


notice that characters #4, #5, and #6 are now correct:
  ffffffc6 ffffffd8 ffffffc5

(I don't get the letter 'a' that you showed in a previous email  
as being part of "Expected output"... I'm not sure that the      
letter 'a' should be expected)

So I'd say the issue has something to do with the JavaScript               
"layer" of your webapp.
What that something is and whether or not it's a bug in ServletExec I      
don't know at this point.

When your "form" is used (the one that uses JavaScript) it both sends 
the POST request to the servlet and also displays the response in the browser.
	So I wondered if maybe the problem was in the display... perhaps something     
that the JavaScript is doing wrong when it displays the response in the         
browser.
So I added a line to your servlet to have it write the response to the          
ServletExec.log file.
I recompiled and re-requested, then used the SE Admin UI to view the contents   
of the ServletExec.log file.
The value written to that file was the same as that displayed in the browser... 
so I'd guess that the problem is occurring *prior* to the displaying of    
the response.

I ran your modified webapp inside SE AS w/ the built-in webserver.
I turned on the "Request" and "AS" levels of debugging      
within that webserver (so that the request headers would be recorded in the     
ServletExec.log file).
Using your JavaScript to POST the characters to your servlet gave this output   
in ServletExec.log:

-----
[Tue Nov 06 10:59:37 EST 2007] ServletExec(DEBUG-AS): request header [Host] =   
[localhost]
[Tue Nov 06 10:59:37 EST 2007] ServletExec(DEBUG-AS): request header            
[User-Agent] = [Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.9)     
Gecko/20071025 Firefox/2.0.0.9]
[Tue Nov 06 10:59:37 EST 2007] ServletExec(DEBUG-AS): request header [Accept] = 
[text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8
,image/png,*/*;q=0.5]
[Tue Nov 06 10:59:37 EST 2007] ServletExec(DEBUG-AS): request header            
[Accept-Language] = [en-us,en;q=0.5]
[Tue Nov 06 10:59:37 EST 2007] ServletExec(DEBUG-AS): request header            
[Accept-Encoding] = [gzip,deflate]
[Tue Nov 06 10:59:37 EST 2007] ServletExec(DEBUG-AS): request header            
[Accept-Charset] = [ISO-8859-1,utf-8;q=0.7,*;q=0.7]
[Tue Nov 06 10:59:37 EST 2007] ServletExec(DEBUG-AS): request header            
[Keep-Alive] = [300]
[Tue Nov 06 10:59:37 EST 2007] ServletExec(DEBUG-AS): request header            
[Connection] = [keep-alive]
[Tue Nov 06 10:59:37 EST 2007] ServletExec(DEBUG-AS): request header            
[X-Requested-With] = [XMLHttpRequest]
[Tue Nov 06 10:59:37 EST 2007] ServletExec(DEBUG-AS): request header            
[Content-Type] = [application/x-www-form-urlencoded; charset=UTF-8]
[Tue Nov 06 10:59:37 EST 2007] ServletExec(DEBUG-AS): request header [Referer]  
= [http://localhost/glennSlotte/]
[Tue Nov 06 10:59:37 EST 2007] ServletExec(DEBUG-AS): request header            
[Content-Length] = [15]
[Tue Nov 06 10:59:37 EST 2007] ServletExec(DEBUG-AS): request header [Cookie] = 
[JSESSIONID=QolutmgkfkzL7fPOKogmLfK-6wI;                                        
JSESSIONID=W74GlBiAlnwqkBVaXltW5y6oEdI]
[Tue Nov 06 10:59:37 EST 2007] ServletExec(DEBUG-AS): request header [Pragma] = 
[no-cache]
[Tue Nov 06 10:59:37 EST 2007] ServletExec(DEBUG-AS): request header            
[Cache-Control] = [no-cache]
[Tue Nov 06 10:59:37 EST 2007] ServletExec(DEBUG-AS): request                   
[POST],[/glennSlotte/TestServlet?fix=0.6515372392296425],[HTTP/1.1]
[Tue Nov 06 10:59:37 EST 2007] ServletExec(DEBUG-Request): begin processing uri 
- /glennSlotte/TestServlet
[Tue Nov 06 10:59:37 EST 2007] ServletExec(DEBUG-Request): check security for - 
/TestServlet
[Tue Nov 06 10:59:37 EST 2007] ServletExec(DEBUG-Request): begin internal       
forward - /TestServlet
[Tue Nov 06 10:59:37 EST 2007] output from Glenn Slotte's TestServlet=|    
ffffffe6 fffffff8 ffffffe5 3f 3f 3f 3f 3f 3f | ffffffc6 ffffffd8 ffffffc5|
[Tue Nov 06 10:59:37 EST 2007] ServletExec(DEBUG-Request): end internal forward 
- /TestServlet
[Tue Nov 06 10:59:37 EST 2007] ServletExec(DEBUG-Request): end processing uri - 
/glennSlotte/TestServlet - in 00.01 seconds.
-----

Using my simple form to POST the same characters to the same servlet gave this
output in ServletExec.log:

----
[Tue Nov 06 10:59:51 EST 2007] ServletExec(DEBUG-AS): request header [Host] =   
[localhost]
[Tue Nov 06 10:59:51 EST 2007] ServletExec(DEBUG-AS): request header            
[User-Agent] = [Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.9)     
Gecko/20071025 Firefox/2.0.0.9]
[Tue Nov 06 10:59:51 EST 2007] ServletExec(DEBUG-AS): request header [Accept] = 
[text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8
,image/png,*/*;q=0.5]
[Tue Nov 06 10:59:51 EST 2007] ServletExec(DEBUG-AS): request header            
[Accept-Language] = [en-us,en;q=0.5]
[Tue Nov 06 10:59:51 EST 2007] ServletExec(DEBUG-AS): request header            
[Accept-Encoding] = [gzip,deflate]
[Tue Nov 06 10:59:51 EST 2007] ServletExec(DEBUG-AS): request header            
[Accept-Charset] = [ISO-8859-1,utf-8;q=0.7,*;q=0.7]
[Tue Nov 06 10:59:51 EST 2007] ServletExec(DEBUG-AS): request header            
[Keep-Alive] = [300]
[Tue Nov 06 10:59:51 EST 2007] ServletExec(DEBUG-AS): request header            
[Connection] = [keep-alive]
[Tue Nov 06 10:59:51 EST 2007] ServletExec(DEBUG-AS): request header [Referer]  
= [http://localhost/glennSlotte/]
[Tue Nov 06 10:59:51 EST 2007] ServletExec(DEBUG-AS): request header [Cookie] = 
[JSESSIONID=QolutmgkfkzL7fPOKogmLfK-6wI;                                        
JSESSIONID=W74GlBiAlnwqkBVaXltW5y6oEdI]
[Tue Nov 06 10:59:51 EST 2007] ServletExec(DEBUG-AS): request header            
[Content-Type] = [application/x-www-form-urlencoded]
[Tue Nov 06 10:59:51 EST 2007] ServletExec(DEBUG-AS): request header            
[Content-Length] = [39]
[Tue Nov 06 10:59:51 EST 2007] ServletExec(DEBUG-AS): request                   
[POST],[/glennSlotte/TestServlet],[HTTP/1.1]
[Tue Nov 06 10:59:51 EST 2007] ServletExec(DEBUG-Request): begin processing uri 
- /glennSlotte/TestServlet
[Tue Nov 06 10:59:51 EST 2007] ServletExec(DEBUG-Request): check security for - 
/TestServlet
[Tue Nov 06 10:59:51 EST 2007] ServletExec(DEBUG-Request): begin internal       
forward - /TestServlet
[Tue Nov 06 10:59:51 EST 2007] output from Glenn Slotte's TestServlet=|    
ffffffe6 fffffff8 ffffffe5 ffffffc6 ffffffd8 ffffffc5 | ffffffc6 ffffffd8       
ffffffc5|
[Tue Nov 06 10:59:51 EST 2007] ServletExec(DEBUG-Request): end internal forward 
- /TestServlet
[Tue Nov 06 10:59:51 EST 2007] ServletExec(DEBUG-Request): end processing uri - 
/glennSlotte/TestServlet - in 00.01 seconds.
----

In the failing case, the content length is only 15.
In the passing case it is 39.

So unless I've missed something, it seems to me that your JavaScript       
client may not be doing the right thing.
I've attached the webapp I used.
Please let me know what you find.
 

Matt McGinty
Software Engineer
New Atlanta Communications, LLC
http://www.newatlanta.com <http://www.newatlanta.com/>                    


Glenn Slotte wrote: 

	It must have been removed by some email filter. I got several replies from     
those :)
	 
	I'll try to send it to you directly.
	
	________________________________
	
	From: servletexec-interest-owner@newatlanta.com on behalf of                   
mmcginty@newatlanta.com
	Sent: Tue 06.11.2007 14:57
	To: servletexec-interest@newatlanta.com
	Subject: Re: SV: [ServletExec] Character encoding issue with POST parameters
	
	
	Glenn,
	There was no example attached.
	Please resend.
	
	Matt McGinty
	Software Engineer
	New Atlanta Communications, LLC
	http://www.newatlanta.com <http://www.newatlanta.com/>                   
	
	
	Glenn Slotte wrote: 
	
		I tried applying the hotfix, but no help there unfortunately. I've also  
checked the tomcat config, but can not see anything that should affect this.
		 
		Attached is a simple example that demonstrates the problem.
		 
		Expected output:  ffffffe6 fffffff8 ffffffe5 a ffffffc6 ffffffd8 ffffffc5 |   
ffffffc6 ffffffd8 ffffffc5
		Current output:  ffffffe6 fffffff8 ffffffe5 3f 3f 3f 3f 3f 3f | ffffffc6      
ffffffd8 ffffffc5
		 
		Glenn
		
		________________________________
		
		From: servletexec-interest-owner@newatlanta.com on behalf of                  
mmcginty@newatlanta.com
		Sent: Mon 05.11.2007 21:21
		To: servletexec-interest@newatlanta.com
		Subject: Re: SV: [ServletExec] Character encoding issue with POST parameters
			
		
		Glenn,
		
		The case of some letters being encoded properly while others are not seems    
quite odd to me... 
		
		Are you using a browser and an HTML form to POST the request parameters?
		Or are you using some sort of custom/non-standard client to send the request  
parameters?
		
		To what are you sending the request parameters (a Servlet or a JSP)?
		
		Are you printing the hex string to the browser? or to Standard out            
[System.out.println() ] or to some other file?
		
		With SE, a call to request.getParameter() returns (by default) a String
		that's been decoded using the 8859_1 charset.
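
To illustrate why the default charset matters, here is a small Java sketch. It uses the JDK's java.net.URLDecoder as a stand-in for a container's parameter parsing, so it is not ServletExec's actual implementation; the class name ParamDecoding and the helper decode are hypothetical. The same percent-encoded bytes give different strings depending on the charset used to decode them.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class ParamDecoding {

    /** Percent-decodes the input and interprets the bytes in the given charset. */
    public static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // UTF-8 bytes c3 a6 (lowercase ae ligature) followed by c3 86 (uppercase).
        String encoded = "%C3%A6%C3%86";

        String asLatin1 = decode(encoded, "ISO-8859-1"); // each byte becomes one char
        String asUtf8 = decode(encoded, "UTF-8");        // each byte pair becomes one char

        System.out.println(asLatin1.length()); // 4
        System.out.println(asUtf8.length());   // 2
    }
}
```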
		
		Also... I know that the request encoding can be configured globally in        
tomcat.
		At least I know this is true with Tomcat 6.0.10 where it is configured in     
tomcat's              
		/conf/server.xml file by changing (*for example*):
		
		   <Connector port="8080" protocol="HTTP/1.1"
		              maxThreads="150" connectionTimeout="20000"
		              redirectPort="8443" />
		
		to this:
		
		   <Connector port="8080" protocol="HTTP/1.1"
		              maxThreads="150" connectionTimeout="20000"
		              redirectPort="8443"
		              URIEncoding="UTF-8" />
		
		
		So maybe check your Tomcat settings as all things may not be set the same     
between Tomcat and SE.
		Granted... using request.setCharacterEncoding() should override that... but   
it might still be prudent to understand any configuration differences that may  
be present in the 2 engine brands you are using.
		
		And maybe consider applying the latest SE 5 hotfix (see SE FAQ #195) and      
seeing if doing *only* that changes the behavior any.
		The latest one right now is the Sept 2007 hotfix.
		It includes fixes for the following encoding-related bugs:
		
		 - bug #2283: Request.getCharacterEncoding() sometimes returns null when it   
should not
		   (this can occur if the browser/client fails to send the request header
		   named "Content-Type")
		
		 - bug #1430: Character encoding not being honored for the body of a custom   
tag
		
		which may or may not be impacting you.
		
		If you are POSTing to a JSP then it may be prudent to "touch" it    
(add a harmless space and then remove it and re-save the file) before you       
re-request it.
		If that does not change the behavior at all then:
		
		 1. if you are POSTing to a JSP, does it use the jsp:include tag or the       
jsp:forward tag?
		 2. can you send a simple example which I could run here to see the problem   
myself?
		
		
		Also... consider that the Sept 2007 hotfix includes a fix for the following:
			------
		- bug #2792: Unable to configure the default request encoding
		    This is now configurable by passing a System property to the JVM at JVM   
startup time.
		    For example to change the default request encoding from 8859_1 to UTF-8:
			    -Dcom.newatlanta.servletexec.request.url.encoding=UTF-8
		------
		
		
		So maybe try using that fix to see if it makes any difference in the          
behavior. 
		Matt McGinty
		Software Engineer
		New Atlanta Communications, LLC
		http://www.newatlanta.com <http://www.newatlanta.com/>                  
		
		
		Glenn Slotte wrote: 
		
			Sorry, it's 5.0 of course. We use ISAPI on IIS6.
			 
			For testing purposes, I print the string as hex using getBytes(). This is    
right after getParameter(). 
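
The test servlet's code is not shown in the thread, so the following Java sketch is only a guess at how such a hex dump could be produced. The class name HexDump and both methods are hypothetical. It assumes each byte is passed to Integer.toHexString(), which would explain the "ffffff" prefix in the outputs quoted in this thread: a byte above 0x7f is negative in Java and is sign-extended when widened to an int.

```java
import java.nio.charset.StandardCharsets;

public class HexDump {

    /** Dumps each byte via Integer.toHexString, sign extension included. */
    public static String dump(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(Integer.toHexString(b)); // byte is widened to a (possibly negative) int
        }
        return sb.toString();
    }

    /** Same dump, but masks each byte to 0..255 first. */
    public static String dumpUnsigned(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(Integer.toHexString(b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Lowercase ae, o-slash, a-ring as single ISO-8859-1 bytes.
        byte[] latin1 = "\u00E6\u00F8\u00E5".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(dump(latin1));         // ffffffe6 fffffff8 ffffffe5
        System.out.println(dumpUnsigned(latin1)); // e6 f8 e5
        // A literal question mark is 0x3f, matching the "3f" entries in the failing output.
        System.out.println(dump("?".getBytes(StandardCharsets.ISO_8859_1))); // 3f
    }
}
```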
			 
			Glenn
			
			________________________________
			
			From: servletexec-interest-owner@newatlanta.com on behalf of
			mmcginty@newatlanta.com
			Sent: Mon 05.11.2007 18:37
			To: servletexec-interest@newatlanta.com
			Subject: Re: [ServletExec] Character encoding issue with POST parameters
			
			
			Glenn,
			
			Sorry for the problem you are having.
			There is no ServletExec 5.5, so please clarify... which version of           
ServletExec are you using?
			The login page of the SE Admin UI should tell you.
			
			And are you using SE ISAPI or SE AS?
			
			Also... where are the characters ending up as question marks?
			In other words... are they being pulled from the request                     
[request.getParameter() ] and then being placed into a database and then        
retrieved from that database?
			Or are they simply being pulled from the request and then displayed in the   
browser?
			
			Glenn Slotte wrote: 
			
				Hi,
				 
				I'm trying to send UTF-8 encoded parameters with POST, but ServletExec
				5.5 fails to recognize certain characters. I use
				request.setCharacterEncoding("UTF8"), which helped, but there are still
				some characters that end up as question marks.
				 
				The characters that now work are c3 a6, c3 b8 and c3 a5. The characters
				that do not work are c3 86, c3 98 and c3 85, which are simply the
				uppercase versions of the former.
				 
				This works fine in Tomcat 5.5, even without manually setting the character  
encoding. It also works fine when manually creating a Java String from a UTF-8  
encoded byte array.
				 
				I appreciate any help you can offer.
				 
				Regards,
				Glenn Slotte
				-------------------------------------------------------------------------
				ServletExec-Interest. For archives and unsubscribe instructions, visit:
				
				    http://www.newatlanta.com/servletexec-interest.jsp
				  
				
				  
			
			  
		
		  
	
	  

