Java/Network Protocol/URLEncoder

Contents

1 Calls java.net.URLEncoder.encode(String, String) via reflection, if we are running on JRE 1.4 or later, otherwise reverts to the deprecated URLEncoder.encode(String) method.
2 Converts a String SJIS or JIS URL encoded hex encoding to a Unicode String
3 Decoding and encoding URLs
4 Encode a path as required by the URL specification
5 Implements the "www-form-urlencoded" encoding scheme, also misleadingly known as URL encoding.
6 Parse an x-www-form-urlencoded string
7 Provides a method to encode any string into a URL-safe form
8 Request parsing and encoding utility methods
9 URL Encoder: Encode a string according to RFC 1738.
10 URL Encoder: similar to the java.net.URLEncoder class

Calls java.net.URLEncoder.encode(String, String) via reflection, if we are running on JRE 1.4 or later, otherwise reverts to the deprecated URLEncoder.encode(String) method.

    
/* 
 * JFreeChart : a free chart library for the Java(tm) platform
 * 
 *
 * (C) Copyright 2000-2008, by Object Refinery Limited and Contributors.
 *
 * Project Info:  http://www.jfree.org/jfreechart/index.html
 *
 * This library is free software; you can redistribute it and/or modify it
 * under the terms of the GNU Lesser General Public License as published by
 * the Free Software Foundation; either version 2.1 of the License, or
 * (at your option) any later version.
 *
 * This library is distributed in the hope that it will be useful, but
 * WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
 * or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public
 * License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with this library; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301,
 * USA.
 *
 * [Java is a trademark or registered trademark of Sun Microsystems, Inc.
 * in the United States and other countries.]
 *
 * -----------------
 * URLUtilities.java
 * -----------------
 * (C) Copyright 2007, 2008, by Object Refinery Limited.
 *
 * Original Author:  David Gilbert (for Object Refinery Limited);
 * Contributors:     -;
 *
 * Changes:
 * --------
 * 17-Apr-2007 : Version 1 (DG);
 *
 */

import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.net.URLEncoder;
/**
 * General utility methods for URLs.
 *
 * @since 1.0.6
 */
public class URLUtilities {
    /** Constant used by {@link #encode(String, String)}. */
    private static final Class[] STRING_ARGS_2 = new Class[] {String.class,
            String.class};
    /**
     * Calls <code>java.net.URLEncoder.encode(String, String)</code> via
     * reflection, if we are running on JRE 1.4 or later, otherwise reverts to
     * the deprecated <code>URLEncoder.encode(String)</code> method.
     *
     * @param s  the string to encode.
     * @param encoding  the encoding.
     *
     * @return The encoded string.
     *
     * @since 1.0.6
     */
    public static String encode(String s, String encoding) {
        Class c = URLEncoder.class;
        String result = null;
        try {
            Method m = c.getDeclaredMethod("encode", STRING_ARGS_2);
            try {
                result = (String) m.invoke(null, new Object[] {s, encoding});
            }
            catch (InvocationTargetException e) {
                e.printStackTrace();
            }
            catch (IllegalAccessException e) {
                e.printStackTrace();
            }
        }
        catch (NoSuchMethodException e) {
            // we're running on JRE 1.3.1 so this is the best we have...
            result = URLEncoder.encode(s);
        }
        return result;
    }
}
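
On JRE 1.4 and later the two-argument method can be called directly, so the reflective lookup above only matters when the same class file must also load on 1.3.1. The following is a minimal sketch of that direct call. The class name DirectEncodeSketch and the fixed UTF-8 charset are illustrative assumptions and are not part of JFreeChart.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, for illustration only.
class DirectEncodeSketch {

    /** Encodes s as application/x-www-form-urlencoded text using UTF-8. */
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode("a b&c")); // prints a+b%26c
    }
}
```

Unlike the reflective version, this sketch never returns null: a failure surfaces as an exception instead of a printed stack trace.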

Converts a String SJIS or JIS URL encoded hex encoding to a Unicode String

    
/* Copyright 2004 Sun Microsystems, Inc.  All rights reserved.  You may not modify, use, reproduce, or distribute this software except in compliance with the terms of the License at: 
 http://adventurebuilder.dev.java.net/LICENSE.txt
 $Id: I18nUtil.java,v 1.2 2004/05/26 00:07:34 inder Exp $ */
import java.io.ByteArrayOutputStream;
/**
 * Utility class for internationalization. This class provides a central
 * location to do specialized formatting in both a default and a locale-specific
 * manner.
 */
public class Main {
  /**
   * Converts a String SJIS or JIS URL encoded hex encoding to a Unicode String
   * 
   */
  public static String convertJISEncoding(String target) {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    if (target == null)
      return null;
    String paramString = target.trim();
    for (int loop = 0; loop < paramString.length(); loop++) {
      int i = (int) paramString.charAt(loop);
      bos.write(i);
    }
    String convertedString = null;
    try {
      convertedString = new String(bos.toByteArray(), "JISAutoDetect");
    } catch (java.io.UnsupportedEncodingException uex) {
      // "JISAutoDetect" is not available on this JRE; null is returned below
    }
    return convertedString;
  }
}
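
The method above treats every char of the input as one raw byte and then decodes the resulting byte array with a multi-byte charset. The sketch below shows the same byte-for-char technique, but with UTF-8 swapped in for "JISAutoDetect", because UTF-8 is guaranteed to exist on every JRE while the JIS charsets may not be installed. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class RedecodeSketch {

    /**
     * Reinterprets a string whose chars each carry one byte value (0..255)
     * as a UTF-8 byte sequence and decodes it.
     */
    static String redecode(String raw) {
        if (raw == null) {
            return null;
        }
        // ISO-8859-1 maps chars 0..255 to the identical byte values.
        // Chars above 0xFF become '?', whereas the original keeps the low 8 bits.
        byte[] bytes = raw.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xC3 0xA9 is the UTF-8 encoding of U+00E9.
        String decoded = redecode("\u00C3\u00A9");
        System.out.println(decoded.equals("\u00E9")); // prints true
    }
}
```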

Decoding and encoding URLs

    
/**
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied. See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */

import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.logging.Logger;

/**
 * Utility class for decoding and encoding URLs
 *
 */
public final class UrlUtils {
    
    private static final String[] RESERVED_CHARS = {"+"};
    private static final String[] ENCODED_CHARS = {"%2b"};
    
    private UrlUtils() {
        
    }
    /**
     * Decodes using URLDecoder - use when queries or form post values are decoded
     * @param value value to decode
     * @return the decoded value, or the input unchanged if UTF-8 is not supported
     */
    public static String urlDecode(String value) {
        try {
            value = URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            System.out.println("UTF-8 encoding can not be used to decode " + value);          
        }
        return value;
    }
    
    /**
     * URL path segments may contain "+" symbols which should not be decoded into " "
     * This method replaces "+" with %2B and delegates to URLDecoder
     * @param value value to decode
     * @return the decoded path value, with any "+" preserved
     */
    public static String pathDecode(String value) {
        // TODO: we actually need to do a proper URI analysis here according to
        // http://tools.ietf.org/html/rfc3986
        for (int i = 0; i < RESERVED_CHARS.length; i++) {
            if (value.indexOf(RESERVED_CHARS[i]) != -1) {
                value = value.replace(RESERVED_CHARS[i], ENCODED_CHARS[i]);
            }
        }
        
        return urlDecode(value);
    }
    
}
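
The difference between the two methods only shows up when the input contains a literal "+". A small sketch of that behaviour, written directly against java.net.URLDecoder so that it stands alone, follows. The class name PlusDecodingSketch is an assumption made for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper, for illustration only.
class PlusDecodingSketch {

    /** Form/query decoding: "+" means a space. */
    static String queryDecode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** Path decoding: protect "+" as %2B first so it survives decoding. */
    static String pathDecode(String value) {
        return queryDecode(value.replace("+", "%2B"));
    }

    public static void main(String[] args) {
        System.out.println(queryDecode("a+b%20c")); // prints a b c
        System.out.println(pathDecode("a+b%20c"));  // prints a+b c
    }
}
```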

Encode a path as required by the URL specification

    
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.util.BitSet;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.Map;
import java.util.StringTokenizer;
/*
 Derby - Class org.apache.derby.iapi.util.PropertyUtil
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to you under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License.  You may obtain a copy of the License at
 http://www.apache.org/licenses/LICENSE-2.0
 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
 */
public class Main {

  /**
   * Array containing the safe characters set as defined by RFC 1738
   */
  private static BitSet safeCharacters;
  private static final char[] hexadecimal =
  {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
   'A', 'B', 'C', 'D', 'E', 'F'};
  static {
      safeCharacters = new BitSet(256);
      int i;
      // "lowalpha" rule
      for (i = 'a'; i <= 'z'; i++) {
          safeCharacters.set(i);
      }
      // "hialpha" rule
      for (i = 'A'; i <= 'Z'; i++) {
          safeCharacters.set(i);
      }
      // "digit" rule
      for (i = '0'; i <= '9'; i++) {
          safeCharacters.set(i);
      }
      // "safe" rule
      safeCharacters.set('$');
      safeCharacters.set('-');
      safeCharacters.set('_');
      safeCharacters.set('.');
      safeCharacters.set('+');
      // "extra" rule
      safeCharacters.set('!');
      safeCharacters.set('*');
      safeCharacters.set('\'');
      safeCharacters.set('(');
      safeCharacters.set(')');
      safeCharacters.set(',');
      // special characters common to http: file: and ftp: URLs ("fsegment" and "hsegment" rules)
      safeCharacters.set('/');
      safeCharacters.set(':');
      safeCharacters.set('@');
      safeCharacters.set('&');
      safeCharacters.set('=');
  }

  /**
   * Encode a path as required by the URL specification (RFC 1738). This differs from <code>java.net.URLEncoder.encode()</code> which encodes according
   * to the <code>x-www-form-urlencoded</code> MIME format.
   *
   * @param path the path to encode
   * @return the encoded path
   */
  public static String encodePath(String path) {
     // stolen from org.apache.catalina.servlets.DefaultServlet ;)
      /**
       * Note: Here, ' ' should be encoded as "%20"
       * and '/' shouldn't be encoded.
       */
      int maxBytesPerChar = 10;
      StringBuffer rewrittenPath = new StringBuffer(path.length());
      ByteArrayOutputStream buf = new ByteArrayOutputStream(maxBytesPerChar);
      OutputStreamWriter writer;
      try {
          writer = new OutputStreamWriter(buf, "UTF8");
      } catch (Exception e) {
          e.printStackTrace();
          writer = new OutputStreamWriter(buf);
      }
      for (int i = 0; i < path.length(); i++) {
          int c = path.charAt(i);
          if (safeCharacters.get(c)) {
              rewrittenPath.append((char)c);
          } else {
              // convert to external encoding before hex conversion
              try {
                  writer.write(c);
                  writer.flush();
              } catch(IOException e) {
                  buf.reset();
                  continue;
              }
              byte[] ba = buf.toByteArray();
              for (int j = 0; j < ba.length; j++) {
                  // Converting each byte in the buffer
                  byte toEncode = ba[j];
                  rewrittenPath.append("%");
                  int low = (toEncode & 0x0f);
                  int high = ((toEncode & 0xf0) >> 4);
                  rewrittenPath.append(hexadecimal[high]);
                  rewrittenPath.append(hexadecimal[low]);
              }
              buf.reset();
          }
      }
      return rewrittenPath.toString();
  }


}
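
To see why path encoding and form encoding must not be confused, the sketch below contrasts the two using only JDK classes. It uses the multi-argument java.net.URI constructor as a stand-in for encodePath, which is an assumption made here for brevity: URI quotes illegal path characters such as the space, but its set of permitted characters is not identical to the BitSet above. The class name PathVsFormSketch is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

// Hypothetical helper, for illustration only.
class PathVsFormSketch {

    /** Path-style encoding: ' ' becomes %20 and '/' is left alone. */
    static String encodePathWithUri(String absolutePath) {
        try {
            // scheme, host and fragment are null; only the path is supplied.
            return new URI(null, null, absolutePath, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Form-style encoding: ' ' becomes '+' and '/' becomes %2F. */
    static String formEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(encodePathWithUri("/docs/my file.txt")); // prints /docs/my%20file.txt
        System.out.println(formEncode("/docs/my file.txt"));        // prints %2Fdocs%2Fmy+file.txt
    }
}
```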

Implements the "www-form-urlencoded" encoding scheme, also misleadingly known as URL encoding.

    
/*
 * Copyright 2001-2004 The Apache Software Foundation.
 * 
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 * 
 *      http://www.apache.org/licenses/LICENSE-2.0
 * 
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */ 

import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;
import java.util.BitSet;

/**
 * <p>Implements the "www-form-urlencoded" encoding scheme, 
 * also misleadingly known as URL encoding.</p>
 *  
 * <p>For more detailed information please refer to the
 * <a href="http://www.w3.org/TR/html4/">HTML 4.01 Specification</a></p>
 * 
 * <p> 
 * This codec is meant to be a replacement for standard Java classes
 * {@link java.net.URLEncoder} and {@link java.net.URLDecoder} 
 * on older Java platforms, as these classes in Java versions below 
 * 1.4 rely on the platform's default charset encoding.
 * </p>
 * 
 * @author Apache Software Foundation
 * @since 1.2
 * @version $Id: URLCodec.java,v 1.19 2004/03/29 07:59:00 ggregory Exp $
 */
public class URLCodec {
    
    /**
     * The default charset used for string decoding and encoding.
     */
    protected String charset = "UTF8";
    
    protected static byte ESCAPE_CHAR = '%';
    /**
     * BitSet of www-form-url safe characters.
     */
    protected static final BitSet WWW_FORM_URL = new BitSet(256);
    
    // Static initializer for www_form_url
    static {
        // alpha characters
        for (int i = 'a'; i <= 'z'; i++) {
            WWW_FORM_URL.set(i);
        }
        for (int i = 'A'; i <= 'Z'; i++) {
            WWW_FORM_URL.set(i);
        }
        // numeric characters
        for (int i = '0'; i <= '9'; i++) {
            WWW_FORM_URL.set(i);
        }
        // special chars
        WWW_FORM_URL.set('-');
        WWW_FORM_URL.set('_');
        WWW_FORM_URL.set('.');
        WWW_FORM_URL.set('*');
        // blank to be replaced with +
        WWW_FORM_URL.set(' ');
    }

    /**
     * Default constructor.
     */
    public URLCodec() {
        super();
    }
    /**
     * Constructor which allows for the selection of a default charset
     * 
     * @param charset the default string charset to use.
     */
    public URLCodec(String charset) {
        super();
        this.charset = charset;
    }
    /**
     * Encodes an array of bytes into an array of URL safe 7-bit 
     * characters. Unsafe characters are escaped.
     *
     * @param urlsafe bitset of characters deemed URL safe
     * @param bytes array of bytes to convert to URL safe characters
     * @return array of bytes containing URL safe characters
     */
    public static final byte[] encodeUrl(BitSet urlsafe, byte[] bytes) 
    {
        if (bytes == null) {
            return null;
        }
        if (urlsafe == null) {
            urlsafe = WWW_FORM_URL;
        }
        
        ByteArrayOutputStream buffer = new ByteArrayOutputStream(); 
        for (int i = 0; i < bytes.length; i++) {
            int b = bytes[i];
            if (b < 0) {
                b = 256 + b;
            }
            if (urlsafe.get(b)) {
                if (b == ' ') {
                    b = '+';
                }
                buffer.write(b);
            } else {
                buffer.write('%');
                char hex1 = Character.toUpperCase(
                  Character.forDigit((b >> 4) & 0xF, 16));
                char hex2 = Character.toUpperCase(
                  Character.forDigit(b & 0xF, 16));
                buffer.write(hex1);
                buffer.write(hex2);
            }
        }
        return buffer.toByteArray(); 
    }

    /**
     * Decodes an array of URL safe 7-bit characters into an array of 
     * original bytes. Escaped characters are converted back to their 
     * original representation.
     *
     * @param bytes array of URL safe characters
     * @return array of original bytes 
     * @throws DecoderException Thrown if URL decoding is unsuccessful
     */
    public static final byte[] decodeUrl(byte[] bytes) 
         throws Exception
    {
        if (bytes == null) {
            return null;
        }
        ByteArrayOutputStream buffer = new ByteArrayOutputStream(); 
        for (int i = 0; i < bytes.length; i++) {
            int b = bytes[i];
            if (b == '+') {
                buffer.write(' ');
            } else if (b == '%') {
                try {
                    int u = Character.digit((char)bytes[++i], 16);
                    int l = Character.digit((char)bytes[++i], 16);
                    if (u == -1 || l == -1) {
                        throw new RuntimeException("Invalid URL encoding");
                    }
                    buffer.write((char)((u << 4) + l));
                } catch(ArrayIndexOutOfBoundsException e) {
                    throw new RuntimeException("Invalid URL encoding");
                }
            } else {
                buffer.write(b);
            }
        }
        return buffer.toByteArray(); 
    }

    /**
     * Encodes an array of bytes into an array of URL safe 7-bit 
     * characters. Unsafe characters are escaped.
     *
     * @param bytes array of bytes to convert to URL safe characters
     * @return array of bytes containing URL safe characters
     */
    public byte[] encode(byte[] bytes) {
        return encodeUrl(WWW_FORM_URL, bytes);
    }

    /**
     * Decodes an array of URL safe 7-bit characters into an array of 
     * original bytes. Escaped characters are converted back to their 
     * original representation.
     *
     * @param bytes array of URL safe characters
     * @return array of original bytes 
     * @throws DecoderException Thrown if URL decoding is unsuccessful
     */
    public byte[] decode(byte[] bytes) throws Exception {
        return decodeUrl(bytes);
    }

    /**
     * Encodes a string into its URL safe form using the specified
     * string charset. Unsafe characters are escaped.
     *
     * @param pString string to convert to a URL safe form
     * @param charset the charset for pString
     * @return URL safe string
     * @throws UnsupportedEncodingException Thrown if charset is not
     *                                      supported 
     */
    public String encode(String pString, String charset) 
        throws UnsupportedEncodingException  
    {
        if (pString == null) {
            return null;
        }
        return new String(encode(pString.getBytes(charset)), "US-ASCII");
    }

    /**
     * Encodes a string into its URL safe form using the default string 
     * charset. Unsafe characters are escaped.
     *
     * @param pString string to convert to a URL safe form
     * @return URL safe string
     * @throws EncoderException Thrown if URL encoding is unsuccessful
     * 
     * @see #getDefaultCharset()
     */
    public String encode(String pString) throws Exception {
        if (pString == null) {
            return null;
        }
        try {
            return encode(pString, getDefaultCharset());
        } catch(UnsupportedEncodingException e) {
            throw new RuntimeException(e.getMessage());
        }
    }

    /**
     * Decodes a URL safe string into its original form using the 
     * specified encoding. Escaped characters are converted back 
     * to their original representation.
     *
     * @param pString URL safe string to convert into its original form
     * @param charset the original string charset
     * @return original string 
     * @throws DecoderException Thrown if URL decoding is unsuccessful
     * @throws UnsupportedEncodingException Thrown if charset is not
     *                                      supported 
     */
    public String decode(String pString, String charset) 
        throws Exception, UnsupportedEncodingException 
    {
        if (pString == null) {
            return null;
        }
        return new String(decode(pString.getBytes("US-ASCII")), charset);
    }

    /**
     * Decodes a URL safe string into its original form using the default
     * string charset. Escaped characters are converted back to their 
     * original representation.
     *
     * @param pString URL safe string to convert into its original form
     * @return original string 
     * @throws DecoderException Thrown if URL decoding is unsuccessful
     * 
     * @see #getDefaultCharset()
     */
    public String decode(String pString) throws Exception {
        if (pString == null) {
            return null;
        }
        try {
            return decode(pString, getDefaultCharset());
        } catch(UnsupportedEncodingException e) {
            throw new RuntimeException(e.getMessage());
        }
    }
    /**
     * Encodes an object into its URL safe form. Unsafe characters are 
     * escaped.
     *
     * @param pObject string to convert to a URL safe form
     * @return URL safe object
     * @throws EncoderException Thrown if URL encoding is not 
     *                          applicable to objects of this type or
     *                          if encoding is unsuccessful
     */
    public Object encode(Object pObject) throws Exception {
        if (pObject == null) {
            return null;
        } else if (pObject instanceof byte[]) {
            return encode((byte[])pObject);
        } else if (pObject instanceof String) {
            return encode((String)pObject);
        } else {
            throw new RuntimeException("Objects of type " +
                pObject.getClass().getName() + " cannot be URL encoded"); 
              
        }
    }
    /**
     * Decodes a URL safe object into its original form. Escaped 
     * characters are converted back to their original representation.
     *
     * @param pObject URL safe object to convert into its original form
     * @return original object 
     * @throws DecoderException Thrown if URL decoding is not 
     *                          applicable to objects of this type
     *                          if decoding is unsuccessful
     */
    public Object decode(Object pObject) throws Exception {
        if (pObject == null) {
            return null;
        } else if (pObject instanceof byte[]) {
            return decode((byte[])pObject);
        } else if (pObject instanceof String) {
            return decode((String)pObject);
        } else {
            throw new RuntimeException("Objects of type " +
                pObject.getClass().getName() + " cannot be URL decoded"); 
              
        }
    }

    /**
     * The default charset used for string decoding and encoding.
     *
     * @return the default string charset.
     */
    public String getDefaultCharset() {
        return this.charset;
    }
}
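
Because the codec first converts the string to bytes and only then escapes them, the chosen charset changes the output for any non-ASCII character. The sketch below demonstrates this with java.net.URLEncoder and java.net.URLDecoder as stand-ins for URLCodec, so that it runs without the class above. The class name CharsetEscapeSketch is an assumption.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper, for illustration only.
class CharsetEscapeSketch {

    static String encode(String s, String charset) {
        try {
            return URLEncoder.encode(s, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    static String decode(String s, String charset) {
        try {
            return URLDecoder.decode(s, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9"; // LATIN SMALL LETTER E WITH ACUTE
        System.out.println(encode(eAcute, "UTF-8"));      // prints %C3%A9
        System.out.println(encode(eAcute, "ISO-8859-1")); // prints %E9
        // Decoding must use the same charset that was used for encoding.
        System.out.println(decode("%C3%A9", "UTF-8").equals(eAcute)); // prints true
    }
}
```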

Parse an x-www-form-urlencoded string

     
import java.net.URLDecoder;
import java.net.URLEncoder;
public class Main {
  public static void main(String[] argv) throws Exception {
    String line = URLEncoder.encode("name1", "UTF-8") + "=" + URLEncoder.encode("value1", "UTF-8");
    String[] pairs = line.split("\\&");
    for (int i = 0; i < pairs.length; i++) {
      String[] fields = pairs[i].split("=");
      String name = URLDecoder.decode(fields[0], "UTF-8");
      System.out.println(name);
      String value = URLDecoder.decode(fields[1], "UTF-8");
      System.out.println(value);
    }
  }
}
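
The example above assumes that every pair contains an "=" and would throw an ArrayIndexOutOfBoundsException for a bare name such as "flag". A slightly more defensive sketch that collects the pairs into a map is shown below. The class name QueryParseSketch, the empty-string default for missing values and the last-value-wins handling of repeated names are all assumptions made for this illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
class QueryParseSketch {

    /** Parses "a=1&b=2" style text into an insertion-ordered map. */
    static Map<String, String> parse(String query) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue; // tolerate "a=1&&b=2"
            }
            // Limit 2 keeps any further '=' inside the value.
            String[] fields = pair.split("=", 2);
            String value = fields.length > 1 ? decode(fields[1]) : "";
            params.put(decode(fields[0]), value);
        }
        return params;
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("name1=value+1&flag&x=a%3Db"));
        // prints {name1=value 1, flag=, x=a=b}
    }
}
```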

Provides a method to encode any string into a URL-safe form

    
/**
 * 
 * JFreeReport : a free Java reporting library
 * 
 *
 * Project Info:  http://reporting.pentaho.org/
 *
 * (C) Copyright 2001-2007, by Object Refinery Ltd, Pentaho Corporation and Contributors.
 *
 * This library is free software; you can redistribute it and/or modify it under the terms
 * of the GNU Lesser General Public License as published by the Free Software Foundation;
 * either version 2.1 of the License, or (at your option) any later version.
 *
 * This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY;
 * without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 * See the GNU Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public License along with this
 * library; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330,
 * Boston, MA 02111-1307, USA.
 *
 * [Java is a trademark or registered trademark of Sun Microsystems, Inc.
 * in the United States and other countries.]
 *
 * ------------
 * UTFEncodingUtil.java
 * ------------
 * (C) Copyright 2001-2007, by Object Refinery Ltd, Pentaho Corporation and Contributors.
 */
import java.io.UnsupportedEncodingException;

/**
 * Provides a method to encode any string into a URL-safe form. Non-ASCII characters are first encoded as sequences of
 * two or three bytes, using the UTF-8 algorithm, before being encoded as %HH escapes.
 * <p/>
 * Code is the public example given at http://www.w3.org/International/O-URL-code.html
 *
 * @author Bert Bos
 */
public class UTFEncodingUtil
{
  /**
   * Private Constructor prevents Object Creation.
   */
  private UTFEncodingUtil()
  {
  }
  /**
   * A lookup table.
   */
  private static final String[] hex = {
      "%00", "%01", "%02", "%03", "%04", "%05", "%06", "%07",
      "%08", "%09", "%0A", "%0B", "%0C", "%0D", "%0E", "%0F",
      "%10", "%11", "%12", "%13", "%14", "%15", "%16", "%17",
      "%18", "%19", "%1A", "%1B", "%1C", "%1D", "%1E", "%1F",
      "%20", "%21", "%22", "%23", "%24", "%25", "%26", "%27",
      "%28", "%29", "%2A", "%2B", "%2C", "%2D", "%2E", "%2F",
      "%30", "%31", "%32", "%33", "%34", "%35", "%36", "%37",
      "%38", "%39", "%3A", "%3B", "%3C", "%3D", "%3E", "%3F",
      "%40", "%41", "%42", "%43", "%44", "%45", "%46", "%47",
      "%48", "%49", "%4A", "%4B", "%4C", "%4D", "%4E", "%4F",
      "%50", "%51", "%52", "%53", "%54", "%55", "%56", "%57",
      "%58", "%59", "%5A", "%5B", "%5C", "%5D", "%5E", "%5F",
      "%60", "%61", "%62", "%63", "%64", "%65", "%66", "%67",
      "%68", "%69", "%6A", "%6B", "%6C", "%6D", "%6E", "%6F",
      "%70", "%71", "%72", "%73", "%74", "%75", "%76", "%77",
      "%78", "%79", "%7A", "%7B", "%7C", "%7D", "%7E", "%7F",
      "%80", "%81", "%82", "%83", "%84", "%85", "%86", "%87",
      "%88", "%89", "%8A", "%8B", "%8C", "%8D", "%8E", "%8F",
      "%90", "%91", "%92", "%93", "%94", "%95", "%96", "%97",
      "%98", "%99", "%9A", "%9B", "%9C", "%9D", "%9E", "%9F",
      "%A0", "%A1", "%A2", "%A3", "%A4", "%A5", "%A6", "%A7",
      "%A8", "%A9", "%AA", "%AB", "%AC", "%AD", "%AE", "%AF",
      "%B0", "%B1", "%B2", "%B3", "%B4", "%B5", "%B6", "%B7",
      "%B8", "%B9", "%BA", "%BB", "%BC", "%BD", "%BE", "%BF",
      "%C0", "%C1", "%C2", "%C3", "%C4", "%C5", "%C6", "%C7",
      "%C8", "%C9", "%CA", "%CB", "%CC", "%CD", "%CE", "%CF",
      "%D0", "%D1", "%D2", "%D3", "%D4", "%D5", "%D6", "%D7",
      "%D8", "%D9", "%DA", "%DB", "%DC", "%DD", "%DE", "%DF",
      "%E0", "%E1", "%E2", "%E3", "%E4", "%E5", "%E6", "%E7",
      "%E8", "%E9", "%EA", "%EB", "%EC", "%ED", "%EE", "%EF",
      "%F0", "%F1", "%F2", "%F3", "%F4", "%F5", "%F6", "%F7",
      "%F8", "%F9", "%FA", "%FB", "%FC", "%FD", "%FE", "%FF"
  };
  /**
   * Encode a string according to RFC 1738.
   * <p/>
   * <quote> "...Only alphanumerics [0-9a-zA-Z], the special characters "$-_.+!*'()," [not including the quotes - ed],
   * and reserved characters used for their reserved purposes may be used unencoded within a URL."</quote>
   * <p/>
   * <ul> <li><p>The ASCII characters 'a' through 'z', 'A' through 'Z', and '0' through '9' remain the same.
   * <p/>
   * <li><p>The unreserved characters - _ . ! ~ * ' ( ) remain the same.
   * <p/>
   * <li><p>All other ASCII characters are converted into the 3-character string "%xy", where xy is the two-digit
   * hexadecimal representation of the character code
   * <p/>
   * <li><p>All non-ASCII characters are encoded in two steps: first to a sequence of 2 or 3 bytes, using the UTF-8
   * algorithm; secondly each of these bytes is encoded as "%xx". </ul>
   *
   * @param s The string to be encoded
   * @return The encoded string
   */
  public static String encodeUTF8(final String s)
  {
    final StringBuffer sbuf = new StringBuffer();
    final char[] sChars = s.toCharArray();
    final int len = sChars.length;
    for (int i = 0; i < len; i++)
    {
      final int ch = sChars[i];
      if ('A' <= ch && ch <= 'Z')
      {    // 'A'..'Z'
        sbuf.append((char) ch);
      }
      else if ('a' <= ch && ch <= 'z')
      {  // 'a'..'z'
        sbuf.append((char) ch);
      }
      else if ('0' <= ch && ch <= '9')
      {  // '0'..'9'
        sbuf.append((char) ch);
      }
      else if (ch == '-' || ch == '_'    // unreserved
          || ch == '.' || ch == '!'
          || ch == '~' || ch == '*'
          || ch == '\'' || ch == '('
          || ch == ')')
      {
        sbuf.append((char) ch);
      }
      else if (ch <= 0x007f)
      {    // other ASCII
        sbuf.append(hex[ch]);
      }
      else if (ch <= 0x07FF)
      {    // non-ASCII <= 0x7FF
        sbuf.append(hex[0xc0 | (ch >> 6)]);
        sbuf.append(hex[0x80 | (ch & 0x3F)]);
      }
      else
      {          // 0x7FF < ch <= 0xFFFF
        sbuf.append(hex[0xe0 | (ch >> 12)]);
        sbuf.append(hex[0x80 | ((ch >> 6) & 0x3F)]);
        sbuf.append(hex[0x80 | (ch & 0x3F)]);
      }
    }
    return sbuf.toString();
  }
  /**
   * Encodes a byte-array. The array is expected to contain ASCII characters, or the result may not be valid.
   *
   * @param s the byte array
   * @return the array as encoded string.
   */
  private static String encodeBytes(final byte[] s)
  {
    final StringBuffer sbuf = new StringBuffer();
    final int len = s.length;
    for (int i = 0; i < len; i++)
    {
      final int ch = (s[i] & 0xff);
      if ('A' <= ch && ch <= 'Z')
      {    // 'A'..'Z'
        sbuf.append((char) ch);
      }
      else if ('a' <= ch && ch <= 'z')
      {  // 'a'..'z'
        sbuf.append((char) ch);
      }
      else if ('0' <= ch && ch <= '9')
      {  // '0'..'9'
        sbuf.append((char) ch);
      }
      else if (ch == '-' || ch == '_'    // unreserved
          || ch == '.' || ch == '!'
          || ch == '~' || ch == '*'
          || ch == '\'' || ch == '('
          || ch == ')')
      {
        sbuf.append((char) ch);
      }
      else
      {    // other ASCII
        sbuf.append(hex[ch]);
      }
    }
    return sbuf.toString();
  }
  /**
   * Encodes the given string using the provided encoding. The encoding must be a valid Java-encoding.
   *
   * @param s        the string that should be encoded.
   * @param encoding the encoding to transform the string into bytes.
   * @return the encoded string.
   * @throws UnsupportedEncodingException if the specified encoding is not recognized.
   */
  public static String encode(final String s, final String encoding)
      throws UnsupportedEncodingException
  {
    if ("utf-8".equalsIgnoreCase(encoding))
    {
      return encodeUTF8(s);
    }
    return encodeBytes(s.getBytes(encoding));
  }

  /**
   * Decodes the given string using the provided encoding. The encoding must be a valid
   * Java-encoding.
   *
   * @param s        the string that should be decoded.
   * @param encoding the encoding to transform the bytes into a string.
   * @return the decoded string.
   * @throws UnsupportedEncodingException if the specified encoding is not recognized.
   */
  public static String decode(final String s, final String encoding)
      throws UnsupportedEncodingException
  {
    if ("utf-8".equalsIgnoreCase(encoding))
    {
      return decodeUTF(s);
    }
    // the resulting string will never be greater than the encoded string
    final byte[] result = new byte[s.length()];
    final char[] chars = s.toCharArray();
    int position = 0;
    for (int i = 0; i < chars.length; i++)
    {
      final char ch = chars[i];
      final int b;
      switch (ch)
      {
        case '%':
          final char lch = s.charAt(++i);
          final int hb = (Character.isDigit(lch)
              ? lch - '0'
              : 10 + Character.toLowerCase(lch) - 'a') & 0xF;
          final char hch = s.charAt(++i);
          final int lb = (Character.isDigit(hch)
              ? hch - '0'
              : 10 + Character.toLowerCase(hch) - 'a') & 0xF;
          b = (hb << 4) | lb;
          break;
        case '+':
          b = ' ';
          break;
        default:
          b = ch;
      }
      result[position] = (byte) b;
      position += 1;
    }
    return new String(result, 0, position, encoding);
  }
  /**
   * Decodes the given URL-encoded string, interpreting the decoded bytes as UTF-8.
   *
   * @param s        the string that should be decoded.
   * @return the decoded string.
   */
  public static String decodeUTF(final String s)
  {
    final StringBuffer sbuf = new StringBuffer();
    final char[] chars = s.toCharArray();
    final int l = chars.length;
    int sumb = 0;
    for (int i = 0, more = -1; i < l; i++)
    {
      /* Get next byte b from URL segment s */
      final int ch = chars[i];
      final int b;
      switch (ch)
      {
        case '%':
          final char lch = s.charAt(++i);
          final int hb = (Character.isDigit(lch)
              ? lch - '0'
              : 10 + Character.toLowerCase(lch) - 'a') & 0xF;
          final char hch = s.charAt(++i);
          final int lb = (Character.isDigit(hch)
              ? hch - '0'
              : 10 + Character.toLowerCase(hch) - 'a') & 0xF;
          b = (hb << 4) | lb;
          break;
        case '+':
          b = ' ';
          break;
        default:
          b = ch;
      }
      /* Decode byte b as UTF-8, sumb collects incomplete chars */
      if ((b & 0xc0) == 0x80)
      {      // 10xxxxxx (continuation byte)
        sumb = (sumb << 6) | (b & 0x3f);  // Add 6 bits to sumb
        if (--more == 0)
        {
          sbuf.append((char) sumb); // Add char to sbuf
        }
      }
      else if ((b & 0x80) == 0x00)
      {    // 0xxxxxxx (yields 7 bits)
        sbuf.append((char) b);      // Store in sbuf
      }
      else if ((b & 0xe0) == 0xc0)
      {    // 110xxxxx (yields 5 bits)
        sumb = b & 0x1f;
        more = 1;        // Expect 1 more byte
      }
      else if ((b & 0xf0) == 0xe0)
      {    // 1110xxxx (yields 4 bits)
        sumb = b & 0x0f;
        more = 2;        // Expect 2 more bytes
      }
      else if ((b & 0xf8) == 0xf0)
      {    // 11110xxx (yields 3 bits)
        sumb = b & 0x07;
        more = 3;        // Expect 3 more bytes
      }
      else if ((b & 0xfc) == 0xf8)
      {    // 111110xx (yields 2 bits)
        sumb = b & 0x03;
        more = 4;        // Expect 4 more bytes
      }
      else /*if ((b & 0xfe) == 0xfc)*/
      {  // 1111110x (yields 1 bit)
        sumb = b & 0x01;
        more = 5;        // Expect 5 more bytes
      }
      /* We don't test if the UTF-8 encoding is well-formed */
    }
    return sbuf.toString();
  }
}
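
The decoder above assembles UTF-8 sequences by hand and, as its own comment notes, does not check that the input is well-formed. As a rough alternative sketch (assuming Java 7 or later for `StandardCharsets`; the class name `PercentDecodeSketch` and its helper are hypothetical and not part of the listing above), the escapes can be collected into a byte buffer and handed to the `String` constructor, which performs the UTF-8 decoding and also handles 4-byte sequences:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class PercentDecodeSketch {

    // Returns the value of a single hex digit, or -1 if c is not one.
    private static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    // Hypothetical helper: turns %XX escapes and '+' into raw bytes,
    // then interprets the collected bytes as UTF-8.
    public static String decodeUtf8(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(s.length());
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            if (ch == '%') {
                if (i + 2 >= s.length()) {
                    throw new IllegalArgumentException("Truncated escape at index " + i);
                }
                int hi = hexValue(s.charAt(i + 1));
                int lo = hexValue(s.charAt(i + 2));
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Invalid escape at index " + i);
                }
                out.write((hi << 4) | lo);
                i += 2; // skip the two hex digits just consumed
            } else if (ch == '+') {
                out.write(' ');
            } else if (ch <= 0x7f) {
                out.write(ch);
            } else {
                // Assumption of this sketch: properly encoded input is pure ASCII.
                throw new IllegalArgumentException("Unescaped non-ASCII character at index " + i);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Unlike the listing above, this sketch rejects truncated or non-hex escapes instead of reading past the end of the string, and leaves malformed byte sequences to the replacement behaviour of the `String` constructor.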

Request parsing and encoding utility methods

   
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 * 
 *      http://www.apache.org/licenses/LICENSE-2.0
 * 
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
import java.io.UnsupportedEncodingException;
import java.text.SimpleDateFormat;
import java.util.Map;
import java.util.TimeZone;

/**
 * General purpose request parsing and encoding utility methods.
 *
 * @author Craig R. McClanahan
 * @author Tim Tye
 * @version $Revision: 901 $ $Date: 2009-01-15 16:49:36 +0100 (Thu, 15 Jan 2009) $
 */
public final class RequestUtil {
    /**
     * The DateFormat to use for generating readable dates in cookies.
     */
    private static SimpleDateFormat format =
        new SimpleDateFormat(" EEEE, dd-MMM-yy kk:mm:ss zz");
    static {
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
    }
    /**
     * Filter the specified message string for characters that are sensitive
     * in HTML.  This avoids potential attacks caused by including JavaScript
     * codes in the request URL that is often reported in error messages.
     *
     * @param message The message string to be filtered
     */
    public static String filter(String message) {
        if (message == null)
            return (null);
        char content[] = new char[message.length()];
        message.getChars(0, message.length(), content, 0);
        StringBuffer result = new StringBuffer(content.length + 50);
        for (int i = 0; i < content.length; i++) {
            switch (content[i]) {
            case '<':
                result.append("&lt;");
                break;
            case '>':
                result.append("&gt;");
                break;
            case '&':
                result.append("&amp;");
                break;
            case '"':
                result.append("&quot;");
                break;
            default:
                result.append(content[i]);
            }
        }
        return (result.toString());
    }

    /**
     * Normalize a relative URI path that may have relative values ("/./",
     * "/../", and so on) in it.  <strong>WARNING</strong> - This method is
     * useful only for normalizing application-generated paths.  It does not
     * try to perform security checks for malicious input.
     *
     * @param path Relative path to be normalized
     */
    public static String normalize(String path) {
        return normalize(path, true);
    }
    /**
     * Normalize a relative URI path that may have relative values ("/./",
     * "/../", and so on) in it.  <strong>WARNING</strong> - This method is
     * useful only for normalizing application-generated paths.  It does not
     * try to perform security checks for malicious input.
     *
     * @param path Relative path to be normalized
     * @param replaceBackSlash Should "\\" be replaced with "/"
     */
    public static String normalize(String path, boolean replaceBackSlash) {
        if (path == null)
            return null;
        // Create a place for the normalized path
        String normalized = path;
        if (replaceBackSlash && normalized.indexOf("\\") >= 0)
            normalized = normalized.replace("\\", "/");
        if (normalized.equals("/."))
            return "/";
        // Add a leading "/" if necessary
        if (!normalized.startsWith("/"))
            normalized = "/" + normalized;
        // Resolve occurrences of "//" in the normalized path
        while (true) {
            int index = normalized.indexOf("//");
            if (index < 0)
                break;
            normalized = normalized.substring(0, index) +
                normalized.substring(index + 1);
        }
        // Resolve occurrences of "/./" in the normalized path
        while (true) {
            int index = normalized.indexOf("/./");
            if (index < 0)
                break;
            normalized = normalized.substring(0, index) +
                normalized.substring(index + 2);
        }
        // Resolve occurrences of "/../" in the normalized path
        while (true) {
            int index = normalized.indexOf("/../");
            if (index < 0)
                break;
            if (index == 0)
                return (null);  // Trying to go outside our context
            int index2 = normalized.lastIndexOf("/", index - 1);
            normalized = normalized.substring(0, index2) +
                normalized.substring(index + 3);
        }
        // Return the normalized path that we have completed
        return (normalized);
    }

    /**
     * Append request parameters from the specified String to the specified
     * Map.  It is presumed that the specified Map is not accessed from any
     * other thread, so no synchronization is performed.
     * <p>
     * <strong>IMPLEMENTATION NOTE</strong>:  URL decoding is performed
     * individually on the parsed name and value elements, rather than on
     * the entire query string ahead of time, to properly deal with the case
     * where the name or value includes an encoded "=" or "&" character
     * that would otherwise be interpreted as a delimiter.
     *
     * @param map Map that accumulates the resulting parameters
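     * @param data Input string containing request parameters
     * @param encoding Encoding used to convert the string to bytes and to
     *  build the parsed names and values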
     *
     * @exception IllegalArgumentException if the data is malformed
     */
    public static void parseParameters(Map map, String data, String encoding)
        throws UnsupportedEncodingException {
        if ((data != null) && (data.length() > 0)) {
            // use the specified encoding to extract bytes out of the
            // given string so that the encoding is not lost. If an
            // encoding is not specified, let it use platform default
            byte[] bytes = null;
            try {
                if (encoding == null) {
                    bytes = data.getBytes();
                } else {
                    bytes = data.getBytes(encoding);
                }
            } catch (UnsupportedEncodingException uee) {
            }
            parseParameters(map, bytes, encoding);
        }
    }

    /**
     * Decode and return the specified URL-encoded String.
     * When the byte array is converted to a string, the system default
     * character encoding is used...  This may be different than some other
     * servers. It is assumed the string is not a query string.
     *
     * @param str The url-encoded string
     *
     * @exception IllegalArgumentException if a "%" character is not followed
     * by a valid 2-digit hexadecimal number
     */
    public static String URLDecode(String str) {
        return URLDecode(str, null);
    }

    /**
     * Decode and return the specified URL-encoded String. It is assumed the
     * string is not a query string.
     *
     * @param str The url-encoded string
     * @param enc The encoding to use; if null, the default encoding is used
     * @exception IllegalArgumentException if a "%" character is not followed
     * by a valid 2-digit hexadecimal number
     */
    public static String URLDecode(String str, String enc) {
        return URLDecode(str, enc, false);
    }
    /**
     * Decode and return the specified URL-encoded String.
     *
     * @param str The url-encoded string
     * @param enc The encoding to use; if null, the default encoding is used
     * @param isQuery Is this a query string being processed
     * @exception IllegalArgumentException if a "%" character is not followed
     * by a valid 2-digit hexadecimal number
     */
    public static String URLDecode(String str, String enc, boolean isQuery) {
        if (str == null)
            return (null);
        // use the specified encoding to extract bytes out of the
        // given string so that the encoding is not lost. If an
        // encoding is not specified, let it use platform default
        byte[] bytes = null;
        try {
            if (enc == null) {
                bytes = str.getBytes();
            } else {
                bytes = str.getBytes(enc);
            }
        } catch (UnsupportedEncodingException uee) {}
        return URLDecode(bytes, enc, isQuery);
    }

    /**
     * Decode and return the specified URL-encoded byte array.
     *
     * @param bytes The url-encoded byte array
     * @exception IllegalArgumentException if a "%" character is not followed
     * by a valid 2-digit hexadecimal number
     */
    public static String URLDecode(byte[] bytes) {
        return URLDecode(bytes, null);
    }

    /**
     * Decode and return the specified URL-encoded byte array.
     *
     * @param bytes The url-encoded byte array
     * @param enc The encoding to use; if null, the default encoding is used
     * @exception IllegalArgumentException if a "%" character is not followed
     * by a valid 2-digit hexadecimal number
     */
    public static String URLDecode(byte[] bytes, String enc) {
        return URLDecode(bytes, enc, false);
    }
    /**
     * Decode and return the specified URL-encoded byte array.
     *
     * @param bytes The url-encoded byte array
     * @param enc The encoding to use; if null, the default encoding is used
     * @param isQuery Is this a query string being processed
     * @exception IllegalArgumentException if a "%" character is not followed
     * by a valid 2-digit hexadecimal number
     */
    public static String URLDecode(byte[] bytes, String enc, boolean isQuery) {
        if (bytes == null)
            return (null);
        int len = bytes.length;
        int ix = 0;
        int ox = 0;
        while (ix < len) {
            byte b = bytes[ix++];     // Get byte to test
            if (b == '+' && isQuery) {
                b = (byte)' ';
            } else if (b == '%') {
                b = (byte) ((convertHexDigit(bytes[ix++]) << 4)
                            + convertHexDigit(bytes[ix++]));
            }
            bytes[ox++] = b;
        }
        if (enc != null) {
            try {
                return new String(bytes, 0, ox, enc);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        return new String(bytes, 0, ox);
    }

    /**
     * Convert a byte character value to its hexadecimal digit value.
     *
     * @param b the character value byte
     */
    private static byte convertHexDigit( byte b ) {
        if ((b >= '0') && (b <= '9')) return (byte)(b - '0');
        if ((b >= 'a') && (b <= 'f')) return (byte)(b - 'a' + 10);
        if ((b >= 'A') && (b <= 'F')) return (byte)(b - 'A' + 10);
        return 0;
    }

    /**
     * Put a name and value pair in the map.  When the name already exists,
     * add the value to its array of values.
     *
     * @param map The map to populate
     * @param name The parameter name
     * @param value The parameter value
     */
    private static void putMapEntry( Map map, String name, String value) {
        String[] newValues = null;
        String[] oldValues = (String[]) map.get(name);
        if (oldValues == null) {
            newValues = new String[1];
            newValues[0] = value;
        } else {
            newValues = new String[oldValues.length + 1];
            System.arraycopy(oldValues, 0, newValues, 0, oldValues.length);
            newValues[oldValues.length] = value;
        }
        map.put(name, newValues);
    }

    /**
     * Append request parameters from the specified String to the specified
     * Map.  It is presumed that the specified Map is not accessed from any
     * other thread, so no synchronization is performed.
     * <p>
     * <strong>IMPLEMENTATION NOTE</strong>:  URL decoding is performed
     * individually on the parsed name and value elements, rather than on
     * the entire query string ahead of time, to properly deal with the case
     * where the name or value includes an encoded "=" or "&" character
     * that would otherwise be interpreted as a delimiter.
     *
     * NOTE: byte array data is modified by this method.  Caller beware.
     *
     * @param map Map that accumulates the resulting parameters
     * @param data Input byte array containing request parameters
     * @param encoding Encoding to use for converting the decoded bytes to strings
     *
     * @exception UnsupportedEncodingException if the encoding is not supported
     */
    public static void parseParameters(Map map, byte[] data, String encoding)
        throws UnsupportedEncodingException {
        if (data != null && data.length > 0) {
            int    ix = 0;
            int    ox = 0;
            String key = null;
            String value = null;
            while (ix < data.length) {
                byte c = data[ix++];
                switch ((char) c) {
                case '&':
                    value = new String(data, 0, ox, encoding);
                    if (key != null) {
                        putMapEntry(map, key, value);
                        key = null;
                    }
                    ox = 0;
                    break;
                case '=':
                    if (key == null) {
                        key = new String(data, 0, ox, encoding);
                        ox = 0;
                    } else {
                        data[ox++] = c;
                    }
                    break;
                case '+':
                    data[ox++] = (byte)' ';
                    break;
                case '%':
                    data[ox++] = (byte)((convertHexDigit(data[ix++]) << 4)
                                    + convertHexDigit(data[ix++]));
                    break;
                default:
                    data[ox++] = c;
                }
            }
            //The last value does not end in "&".  So save it now.
            if (key != null) {
                value = new String(data, 0, ox, encoding);
                putMapEntry(map, key, value);
            }
        }
    }

}
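
The implementation note in `parseParameters` says that decoding happens per name and per value, after splitting, so that an encoded "=" or "&" is not mistaken for a delimiter. A minimal sketch of the same idea on strings, using the standard `java.net.URLDecoder` (the class name `QueryParseSketch` is hypothetical, and UTF-8 is an assumption of this example):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class QueryParseSketch {

    // Splits on '&' and '=' first, then decodes each name and value separately.
    public static Map<String, List<String>> parse(String query) {
        Map<String, List<String>> params = new LinkedHashMap<String, List<String>>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue; // tolerate "a=1&&b=2"
            }
            int eq = pair.indexOf('=');
            String rawName = eq < 0 ? pair : pair.substring(0, eq);
            String rawValue = eq < 0 ? "" : pair.substring(eq + 1);
            String name = decode(rawName);
            List<String> values = params.get(name);
            if (values == null) {
                values = new ArrayList<String>();
                params.put(name, values);
            }
            values.add(decode(rawValue));
        }
        return params;
    }

    private static String decode(String raw) {
        try {
            return URLDecoder.decode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

One difference from the listing above: this sketch keeps a name that has no "=" (with an empty value), whereas `parseParameters` only stores an entry once it has seen a key followed by "=". `URLDecoder.decode` throws `IllegalArgumentException` on malformed escapes.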

URL Encoder: Encode a string according to RFC 1738.

    
/**
 * 
 * LibFormula : a free Java formula library
 * 
 *
 * Project Info:  http://reporting.pentaho.org/libformula/
 *
 * (C) Copyright 2006-2007, by Pentaho Corporation and Contributors.
 *
 * This library is free software; you can redistribute it and/or modify it under the terms
 * of the GNU Lesser General Public License as published by the Free Software Foundation;
 * either version 2.1 of the License, or (at your option) any later version.
 *
 * This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY;
 * without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 * See the GNU Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public License along with this
 * library; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330,
 * Boston, MA 02111-1307, USA.
 *
 * [Java is a trademark or registered trademark of Sun Microsystems, Inc.
 * in the United States and other countries.]
 *
 *
 * ------------
 * $Id: URLEncoder.java 3521 2007-10-16 10:55:14Z tmorgner $
 * ------------
 * (C) Copyright 2006-2007, by Pentaho Corporation.
 */

import java.io.UnsupportedEncodingException;
/**
 * Creation-Date: Jan 22, 2007, 4:36:38 PM
 *
 * @author Thomas Morgner
 */
public class URLEncoder
{
  private URLEncoder()
  {
  }
  private static final String[] hex = {
    "%00", "%01", "%02", "%03", "%04", "%05", "%06", "%07",
    "%08", "%09", "%0A", "%0B", "%0C", "%0D", "%0E", "%0F",
    "%10", "%11", "%12", "%13", "%14", "%15", "%16", "%17",
    "%18", "%19", "%1A", "%1B", "%1C", "%1D", "%1E", "%1F",
    "%20", "%21", "%22", "%23", "%24", "%25", "%26", "%27",
    "%28", "%29", "%2A", "%2B", "%2C", "%2D", "%2E", "%2F",
    "%30", "%31", "%32", "%33", "%34", "%35", "%36", "%37",
    "%38", "%39", "%3A", "%3B", "%3C", "%3D", "%3E", "%3F",
    "%40", "%41", "%42", "%43", "%44", "%45", "%46", "%47",
    "%48", "%49", "%4A", "%4B", "%4C", "%4D", "%4E", "%4F",
    "%50", "%51", "%52", "%53", "%54", "%55", "%56", "%57",
    "%58", "%59", "%5A", "%5B", "%5C", "%5D", "%5E", "%5F",
    "%60", "%61", "%62", "%63", "%64", "%65", "%66", "%67",
    "%68", "%69", "%6A", "%6B", "%6C", "%6D", "%6E", "%6F",
    "%70", "%71", "%72", "%73", "%74", "%75", "%76", "%77",
    "%78", "%79", "%7A", "%7B", "%7C", "%7D", "%7E", "%7F",
    "%80", "%81", "%82", "%83", "%84", "%85", "%86", "%87",
    "%88", "%89", "%8A", "%8B", "%8C", "%8D", "%8E", "%8F",
    "%90", "%91", "%92", "%93", "%94", "%95", "%96", "%97",
    "%98", "%99", "%9A", "%9B", "%9C", "%9D", "%9E", "%9F",
    "%A0", "%A1", "%A2", "%A3", "%A4", "%A5", "%A6", "%A7",
    "%A8", "%A9", "%AA", "%AB", "%AC", "%AD", "%AE", "%AF",
    "%B0", "%B1", "%B2", "%B3", "%B4", "%B5", "%B6", "%B7",
    "%B8", "%B9", "%BA", "%BB", "%BC", "%BD", "%BE", "%BF",
    "%C0", "%C1", "%C2", "%C3", "%C4", "%C5", "%C6", "%C7",
    "%C8", "%C9", "%CA", "%CB", "%CC", "%CD", "%CE", "%CF",
    "%D0", "%D1", "%D2", "%D3", "%D4", "%D5", "%D6", "%D7",
    "%D8", "%D9", "%DA", "%DB", "%DC", "%DD", "%DE", "%DF",
    "%E0", "%E1", "%E2", "%E3", "%E4", "%E5", "%E6", "%E7",
    "%E8", "%E9", "%EA", "%EB", "%EC", "%ED", "%EE", "%EF",
    "%F0", "%F1", "%F2", "%F3", "%F4", "%F5", "%F6", "%F7",
    "%F8", "%F9", "%FA", "%FB", "%FC", "%FD", "%FE", "%FF"
  };
  /**
   * Encode a string according to RFC 1738.
   * <p/>
   * <quote> "...Only alphanumerics [0-9a-zA-Z], the special characters "$-_.+!*'()," [not
   * including the quotes - ed], and reserved characters used for their reserved purposes
   * may be used unencoded within a URL."</quote>
   * <p/>
   * <ul> <li><p>The ASCII characters "a" through "z", "A" through "Z", and "0" through
   * "9" remain the same.
   * <p/>
   * <li><p>The unreserved characters - _ . ! ~ * ' ( ) remain the same.
   * <p/>
   * <li><p>All other ASCII characters are converted into the 3-character string "%xy",
   * where xy is the two-digit hexadecimal representation of the character code
   * <p/>
   * <li><p>All non-ASCII characters are encoded in two steps: first to a sequence of 2 or
   * 3 bytes, using the UTF-8 algorithm; secondly each of these bytes is encoded as "%xx".
   * </ul>
   *
   * @param s The string to be encoded
   * @return The encoded string
   */
  public static String encodeUTF8 (final String s)
  {
    final StringBuffer sbuf = new StringBuffer();
    final char[] sChars = s.toCharArray();
    final int len = sChars.length;
    for (int i = 0; i < len; i++)
    {
      final int ch = sChars[i];
      if ('A' <= ch && ch <= 'Z')
      {   // 'A'..'Z'
        sbuf.append((char) ch);
      }
      else if ('a' <= ch && ch <= 'z')
      { // 'a'..'z'
        sbuf.append((char) ch);
      }
      else if ('0' <= ch && ch <= '9')
      { // '0'..'9'
        sbuf.append((char) ch);
      }
      else if (ch == '-' || ch == '_'   // unreserved
              || ch == '.' || ch == '!'
              || ch == '~' || ch == '*'
              || ch == '\'' || ch == '('
              || ch == ')')
      {
        sbuf.append((char) ch);
      }
      else if (ch <= 0x007f)
      {   // other ASCII
        sbuf.append(hex[ch]);
      }
      else if (ch <= 0x07FF)
      {   // non-ASCII <= 0x7FF
        sbuf.append(hex[0xc0 | (ch >> 6)]);
        sbuf.append(hex[0x80 | (ch & 0x3F)]);
      }
      else
      {         // 0x7FF < ch <= 0xFFFF
        sbuf.append(hex[0xe0 | (ch >> 12)]);
        sbuf.append(hex[0x80 | ((ch >> 6) & 0x3F)]);
        sbuf.append(hex[0x80 | (ch & 0x3F)]);
      }
    }
    return sbuf.toString();
  }
  private static String encodeBytes (final byte[] s)
  {
    final StringBuffer sbuf = new StringBuffer();
    final int len = s.length;
    for (int i = 0; i < len; i++)
    {
      final int ch = (s[i] & 0xff);
      if ('A' <= ch && ch <= 'Z')
      {   // 'A'..'Z'
        sbuf.append((char) ch);
      }
      else if ('a' <= ch && ch <= 'z')
      { // 'a'..'z'
        sbuf.append((char) ch);
      }
      else if ('0' <= ch && ch <= '9')
      { // '0'..'9'
        sbuf.append((char) ch);
      }
      else if (ch == '-' || ch == '_'   // unreserved
              || ch == '.' || ch == '!'
              || ch == '~' || ch == '*'
              || ch == '\'' || ch == '('
              || ch == ')')
      {
        sbuf.append((char) ch);
      }
      else
      {   // other ASCII
        sbuf.append(hex[ch]);
      }
    }
    return sbuf.toString();
  }
  public static String encode (final String s, final String encoding)
          throws UnsupportedEncodingException
  {
    if ("utf-8".equalsIgnoreCase(encoding))
    {
      return encodeUTF8(s);
    }
    return encodeBytes(s.getBytes(encoding));
  }
}
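
`encodeUTF8` above works one `char` at a time and emits at most three bytes per character, so a supplementary character (stored as a surrogate pair) comes out as two 3-byte sequences rather than one 4-byte UTF-8 sequence. A possible variant, sketched here under the assumption that Java 7+ `StandardCharsets` is available (the class name `PercentEncodeSketch` is hypothetical), lets the platform produce the UTF-8 bytes and only decides per byte whether to escape it, using the same unreserved set as the listing:

```java
import java.nio.charset.StandardCharsets;

public class PercentEncodeSketch {

    // Same unreserved punctuation as in the listing above.
    private static final String UNRESERVED_MARKS = "-_.!~*'()";
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    public static String encodeUtf8(String s) {
        StringBuilder out = new StringBuilder(s.length() * 3);
        for (byte raw : s.getBytes(StandardCharsets.UTF_8)) {
            int b = raw & 0xff; // treat the byte as unsigned
            boolean alnum = (b >= 'A' && b <= 'Z')
                    || (b >= 'a' && b <= 'z')
                    || (b >= '0' && b <= '9');
            if (alnum || UNRESERVED_MARKS.indexOf(b) >= 0) {
                out.append((char) b);
            } else {
                out.append('%').append(HEX[b >> 4]).append(HEX[b & 0x0f]);
            }
        }
        return out.toString();
    }
}
```

As in the listing, a space becomes "%20" rather than "+".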

URL Encoder: similar to the java.net.URLEncoder class

    
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.util.BitSet;
/**
 * 
 * This class is very similar to the java.net.URLEncoder class.
 * 
 * Unfortunately, with java.net.URLEncoder there is no way to specify to the
 * java.net.URLEncoder which characters should NOT be encoded.
 * 
 * This code was moved from DefaultServlet.java
 * 
 * @author Craig R. McClanahan
 * @author Remy Maucherat
 */
public class URLEncoder {
  protected static final char[] hexadecimal = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
      'A', 'B', 'C', 'D', 'E', 'F' };
  // Array containing the safe characters set.
  protected BitSet safeCharacters = new BitSet(256);
  public URLEncoder() {
    for (char i = 'a'; i <= 'z'; i++) {
      addSafeCharacter(i);
    }
    for (char i = 'A'; i <= 'Z'; i++) {
      addSafeCharacter(i);
    }
    for (char i = '0'; i <= '9'; i++) {
      addSafeCharacter(i);
    }
  }
  public void addSafeCharacter(char c) {
    safeCharacters.set(c);
  }
  public String encode(String path) {
    int maxBytesPerChar = 10;
    int caseDiff = ('a' - 'A');
    StringBuffer rewrittenPath = new StringBuffer(path.length());
    ByteArrayOutputStream buf = new ByteArrayOutputStream(maxBytesPerChar);
    OutputStreamWriter writer = null;
    try {
      writer = new OutputStreamWriter(buf, "UTF8");
    } catch (Exception e) {
      e.printStackTrace();
      writer = new OutputStreamWriter(buf);
    }
    for (int i = 0; i < path.length(); i++) {
      int c = (int) path.charAt(i);
      if (safeCharacters.get(c)) {
        rewrittenPath.append((char) c);
      } else {
        // convert to external encoding before hex conversion
        try {
          writer.write((char) c);
          writer.flush();
        } catch (IOException e) {
          buf.reset();
          continue;
        }
        byte[] ba = buf.toByteArray();
        for (int j = 0; j < ba.length; j++) {
          // Converting each byte in the buffer
          byte toEncode = ba[j];
          rewrittenPath.append("%");
          int low = (int) (toEncode & 0x0f);
          int high = (int) ((toEncode & 0xf0) >> 4);
          rewrittenPath.append(hexadecimal[high]);
          rewrittenPath.append(hexadecimal[low]);
        }
        buf.reset();
      }
    }
    return rewrittenPath.toString();
  }
}
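
The point of the class above is that the caller chooses which characters stay unescaped, for example "/" when encoding a path. The following is a compact sketch of that idea rather than a copy of the class: `SafeSetEncoderSketch` and its constructor argument are hypothetical names, it assumes Java 7+ for `StandardCharsets`, and it iterates over code points so that surrogate pairs are encoded as a single UTF-8 sequence.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

public class SafeSetEncoderSketch {

    private final BitSet safe = new BitSet(128);

    // extraSafe lists characters, beyond ASCII letters and digits, to leave unescaped.
    public SafeSetEncoderSketch(String extraSafe) {
        safe.set('a', 'z' + 1); // the upper bound of BitSet.set(from, to) is exclusive
        safe.set('A', 'Z' + 1);
        safe.set('0', '9' + 1);
        for (char c : extraSafe.toCharArray()) {
            safe.set(c);
        }
    }

    public String encode(String path) {
        StringBuilder out = new StringBuilder(path.length());
        int i = 0;
        while (i < path.length()) {
            int cp = path.codePointAt(i);
            i += Character.charCount(cp);
            if (safe.get(cp)) {
                out.appendCodePoint(cp);
                continue;
            }
            // Convert to UTF-8 first, then escape every resulting byte.
            byte[] bytes = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
            for (byte b : bytes) {
                out.append(String.format("%%%02X", b & 0xff));
            }
        }
        return out.toString();
    }
}
```

For instance, constructing the encoder with `"/-_."` keeps path separators intact, while an encoder constructed with an empty string escapes them.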
