[Originally posted May 11, 2004]
Performance is always a concern with Web services: there are better ways to design a performant distributed system than sticking it behind an HTTP port and throwing verbose XML messages at it.
Lately I’ve been exploring some mechanisms to address performance issues for Web services. Most of these approaches target bottlenecks at the lower-level SOAP layer, rather than at the design level. I’ve looked at some articles and technical papers that present metrics and guidelines. From them I’ve distilled the following performance best practices:
- Design your Web service interface to minimize the network traffic. A ‘coarse-grained’ API is better, as you minimize the number of requests a client has to make to get information. 
- Large SOAP messages are a performance bottleneck due to time spent parsing them. Keep your payload size as small as possible.
- Complex SOAP messages are a performance bottleneck due to time spent serializing/deserializing them. Keep your payload complexity low. However, payload complexity and payload size are often design tradeoffs.
- SOAP intermediaries (gateways, proxies) should minimize parsing of messages.
- Use better XML parsing techniques. For most applications, event-driven parsers (SAX style) are more performant than DOM-style parsers.
- Document/Literal style SOAP messages are smaller and less complex than RPC-style SOAP messages.
- Security has performance costs. Not all SOAP traffic needs to be secure. The performance cost of end-to-end security (e.g. WS-Security) is, in most cases, higher than that of a transport-level security mechanism like SSL.
- Caching is a way to improve performance for processor-intensive services, though it applies only to read-only services.
- Many of the performance best practices for web applications apply here too (using EJBs vs. JavaBeans, passing EJB components by reference, hardware and capacity settings, JVM settings, etc.).
- Persistent connections are good for performance when there is a large number of messages with small payloads. For larger messages, they have less of an effect. HTTP keep-alive is a way to request that an HTTP connection persist, though this is the default in HTTP/1.1.
- Streaming connections are good for performance in case of a large payload size. HTTP ‘chunked encoding’ is a kind of streaming, and is supported by HTTP/1.1. 
- Binary encoding of some payload elements should be considered. 
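
To make the SAX-versus-DOM point above a little more concrete, here is a minimal Python sketch that pulls the same values out of a SOAP response in both styles. The envelope, the `urn:example:quotes` namespace, the `quote` element and the helper names are all made up for illustration; none of them come from the articles cited below. The SAX handler reacts to events as the parser walks the message and never holds the whole document in memory, while the DOM version builds the full tree before it can be queried.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical SOAP response, invented purely for this example.
SOAP_RESPONSE = b"""<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <getQuotesResponse xmlns="urn:example:quotes">
      <quote symbol="AAA">88.50</quote>
      <quote symbol="BBB">4.10</quote>
    </getQuotesResponse>
  </soap:Body>
</soap:Envelope>"""


class QuoteHandler(xml.sax.ContentHandler):
    """Collects <quote> values as parse events arrive; no tree is built."""

    def __init__(self):
        super().__init__()
        self.quotes = {}
        self._symbol = None   # symbol of the <quote> currently being read
        self._text = []       # text chunks seen inside that element

    def startElement(self, name, attrs):
        if name == "quote":
            self._symbol = attrs.get("symbol")
            self._text = []

    def characters(self, content):
        # The parser may deliver text in several chunks, so accumulate them.
        if self._symbol is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "quote" and self._symbol is not None:
            self.quotes[self._symbol] = float("".join(self._text))
            self._symbol = None


def quotes_via_sax(payload):
    """Event-driven extraction: one pass, only the wanted values are kept."""
    handler = QuoteHandler()
    xml.sax.parseString(payload, handler)
    return handler.quotes


def quotes_via_dom(payload):
    """Tree-based extraction: the whole document is materialized first."""
    doc = minidom.parseString(payload)
    return {
        node.getAttribute("symbol"): float(node.firstChild.data)
        for node in doc.getElementsByTagName("quote")
    }
```

Both functions return the same dictionary for this payload. The sketch only shows the difference in programming model; any actual speed or memory gap depends on the parser and the message size, so measure with your own payloads before drawing conclusions.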
However, all said and done, remember the reason you are choosing Web services: interoperability across heterogeneous environments, not performance!
- Holt Adams. Web services performance considerations, Part 1. http://www-106.ibm.com/developerworks/library/ws-best9/
- Holt Adams. Web services performance considerations, Part 2. http://www-106.ibm.com/developerworks/webservices/library/ws-best10/
- Kenneth Chiu, Madhusudhan Govindaraju, et al. Investigating the Limits of SOAP Performance for Scientific Computing. http://www.extreme.indiana.edu/xgws/papers/soap-hpdc2002/soap-hpdc2002.pdf
- Madhusudhan Govindaraju, Aleksander Slominski, et al. Requirements for and Evaluation of RMI Protocols for Scientific Computing. http://www.sc2000.org/techpapr/papers/pap.pap261.pdf
- Frank Cohen. Discover SOAP encoding’s impact on Web service performance. http://www-106.ibm.com/developerworks/webservices/library/ws-soapenc/
- Dan Davis and Manish Parashar. Latency Performance of SOAP Implementations. http://www.caip.rutgers.edu/TASSL/Papers/p2p-p2pws02-soap.pdf
[Update May 12, 2004] I found another website that consolidates performance best practices.