
[java] Java Strings

This topic is 3006 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


I tried some quick searches, but I want to know why the Java String is so odd. Correct me if I am wrong, but I think it creates an entirely new string any time you modify it, which makes String handling extremely time and memory consuming. There are great classes like StringBuffer which seem to fix this, so my question is: why did they make String like this, as opposed to like the StringBuffer class?
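
To make the behaviour in question concrete, here is a minimal sketch (the class and method names are invented for illustration). It shows that a "modifying" call such as concat() returns a new String object and leaves the original untouched.

```java
// Illustrative only: StringIsImmutableDemo and appendWorld are made-up names.
class StringIsImmutableDemo {
    // Returns a new String; the argument itself is never changed.
    static String appendWorld(String s) {
        return s.concat(" world");
    }

    public static void main(String[] args) {
        String original = "hello";
        String changed = appendWorld(original);

        System.out.println(original);            // hello
        System.out.println(changed);             // hello world
        System.out.println(original == changed); // false: two distinct objects
    }
}
```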

Immutable objects can be shared by many clients without having to worry about side effects. C# does the exact same thing.

Strings are immutable, so you can hand out private strings without a caller accidentally overwriting a private member.

EDIT:

Beaten to it, but yeah: to keep private members private you either have to make them immutable or clone them before you give them out, and the latter is usually more wasteful of resources.

EDIT2:

On a side note, the b2Vec2 class in Box2DFlashAS3 totally fails at this:

var vec:b2Vec2 = _body.GetLinearVelocity();
vec.x *= .3;

will actually change the velocity of the physics body.
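
The trade-off described in this post (make the member immutable, or clone it before handing it out) can be sketched in Java roughly as follows. Vec2 and Body here are hypothetical stand-ins written for this example; they are not the Box2D classes.

```java
// Hypothetical mutable vector type, standing in for something like b2Vec2.
class Vec2 {
    double x, y;
    Vec2(double x, double y) { this.x = x; this.y = y; }
}

class Body {
    private final String name;                          // immutable: safe to share as-is
    private final Vec2 velocity = new Vec2(10.0, 0.0);  // mutable: sharing it leaks state

    Body(String name) { this.name = name; }

    String getName() { return name; }

    // Hands out the internal object; callers can change the body's state.
    Vec2 getVelocityShared() { return velocity; }

    // Defensive copy: safe, but allocates a new object on every call.
    Vec2 getVelocityCopy() { return new Vec2(velocity.x, velocity.y); }
}

class DefensiveCopyDemo {
    public static void main(String[] args) {
        Body body = new Body("crate");

        Vec2 copy = body.getVelocityCopy();
        copy.x *= 0.5;                                  // only the copy changes
        System.out.println(body.getVelocityCopy().x);   // 10.0

        Vec2 shared = body.getVelocityShared();
        shared.x *= 0.5;                                // changes the body itself
        System.out.println(body.getVelocityCopy().x);   // 5.0
    }
}
```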

Quote:
Original post by Crazyfool
...which makes String handling extremely time and memory consuming...


Have you actually benchmarked an application to determine if it's extremely time or memory consuming?

There is a lot of information available on the internet about the advantages and disadvantages of mutable and immutable objects. There are also plenty of benchmarks showing when it's more advantageous to concatenate immutable strings and when to append to a mutable one.

Quote:
Original post by kryat
Quote:
Original post by Crazyfool
...which makes String handling extremely time and memory consuming...


Have you actually benchmarked an application to determine if it's extremely time or memory consuming?

There is a lot of information available on the internet about the advantages and disadvantages of mutable and immutable objects. There are also plenty of benchmarks showing when it's more advantageous to concatenate immutable strings and when to append to a mutable one.


Benchmark? Not really. But I have run into heap issues (without increasing the heap size) until I optimized, including using StringBuffers where I can. I also saw an improvement in running time, but that could have been down to the memory issue. As a side point - I am not an expert with Java. I only know what I know from experience working on school assignments, and this was just a recent issue that came up for me.

If you are so concerned about performance, maybe you should use StringBuilder instead of StringBuffer.
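
For readers comparing the options, a rough sketch follows (the class and method names are invented for this example). StringBuilder was added in Java 5 with the same append API as StringBuffer, but its methods are not synchronized, so it is usually the better fit when only one thread touches the buffer. The third method shows the repeated-concatenation pattern that creates a new String on every pass.

```java
// Illustrative only: BuilderVsBufferDemo and its methods are made-up names.
class BuilderVsBufferDemo {
    // Synchronized buffer: safe to share between threads, with locking overhead.
    static String repeatWithBuffer(String piece, int times) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < times; i++) {
            sb.append(piece);
        }
        return sb.toString();
    }

    // Unsynchronized buffer: same API, intended for single-threaded use.
    static String repeatWithBuilder(String piece, int times) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < times; i++) {
            sb.append(piece);
        }
        return sb.toString();
    }

    // Repeated concatenation: each pass builds a new, longer String.
    static String repeatWithConcat(String piece, int times) {
        String result = "";
        for (int i = 0; i < times; i++) {
            result += piece;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(repeatWithBuffer("ab", 3));   // ababab
        System.out.println(repeatWithBuilder("ab", 3));  // ababab
        System.out.println(repeatWithConcat("ab", 3));   // ababab
    }
}
```

All three produce the same text; how much the choice matters in practice depends on the number and size of the appends, so it is worth measuring before optimizing.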

I didn't know StringBuilder was better/different, thanks! As I previously mentioned, it was for school (and I have already completed the assignment), so performance isn't top priority. However, when it throws an out-of-memory exception, it's kind of hard to demonstrate correctness :) I also have to give a demo in person, and the only portable computer is a netbook with a tiny processor and a tiny amount of RAM, so it can't be too slow.

Quote:
Original post by DevFred
Immutable objects can be shared by many clients without having to worry about side effects. C# does the exact same thing.
I admit I'm not yet sold on this rationale. I can see the benefits for sure, but shouldn't this be the goal of const or final?

I also find Java Strings a bit odd, especially because they have been granted the right to have their own operators, resulting in different behaviour from the standard copy-by-reference one would expect when calling operator= on them.
Sure, operator+ comes in extremely handy, yet strings are a big language exception to me...

Quote:
Original post by Krohm
different behaviour from the standard copy-by-reference one would expect when calling operator= on them.

I don't understand what you are talking about here. Could you elaborate? Maybe with a few lines of code that demonstrates what you would expect (and which does not happen)?
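
For reference, here is a small sketch of what = and += actually do with String variables in Java (names invented for this example; this is only one guess at the behaviour being discussed). Assignment copies the reference like for any other object; += builds a new String and rebinds the variable, so the other variable still sees the old value.

```java
// Illustrative only: StringAssignmentDemo and assignThenAppend are made-up names.
class StringAssignmentDemo {
    // Returns {a, b} after assigning b = a and then appending to b.
    static String[] assignThenAppend() {
        String a = "abc";
        String b = a;      // copies the reference: a and b name the same object
        b += "def";        // creates a new String and rebinds b; a is unaffected
        return new String[] { a, b };
    }

    public static void main(String[] args) {
        String a = "abc";
        String b = a;
        System.out.println(a == b);  // true: same object

        b += "def";
        System.out.println(a);       // abc
        System.out.println(b);       // abcdef
        System.out.println(a == b);  // false: b now refers to a different object
    }
}
```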

Quote:
Original post by Krohm
Quote:
Original post by DevFred
Immutable objects can be shared by many clients without having to worry about side effects. C# does the exact same thing.
I admit I'm not yet sold on this rationale. I can see the benefits for sure, but shouldn't this be the goal of const or final?


No. The idea is that when dealing with multi-threaded applications, side effects are a very bad thing. You don't want a function to modify data that is being used by two threads at once, because this can lead to inconsistent application state and hard-to-find bugs.

So by making strings immutable, they are inherently thread safe, and you can easily pass them around however you want without having to carefully manage which thread has possession of which string.

const and final don't cover this. In C#, a const string must be a compile-time constant, so it cannot be constructed dynamically in the application; in Java, final only stops the variable from being reassigned and says nothing about the object it refers to.

Having a mutable String would actually be a huge performance/security problem. Consider a Person class with a string property called 'name' and an int property 'age'. Now your application wants to display a string to the user that contains the person's name and age, for example "Bill, age 24".

So the application gets the property of the Person and appends age to it. All fine and dandy. But next time it does that, the string is "Bill, age 24, age 24". What is actually happening is that the name of the Person gets modified every time by accident. You could prevent accidents like that by returning a new String object every time the application requests the name property of the Person. But then the method to retrieve the property would become costly, because every time it is called, a new object has to be created.
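
The accident described above can be reproduced in Java by using StringBuilder as a stand-in for a hypothetical mutable String. This is only a sketch; the two Person variants and the describe methods are invented for illustration.

```java
// Variant with a mutable name: StringBuilder plays the role of a mutable String.
class MutableNamePerson {
    private final StringBuilder name;
    private final int age;

    MutableNamePerson(String name, int age) {
        this.name = new StringBuilder(name);
        this.age = age;
    }

    StringBuilder getName() { return name; }  // hands out the internal buffer
    int getAge() { return age; }

    // Appends to the person's own name by accident.
    static String describe(MutableNamePerson p) {
        return p.getName().append(", age ").append(p.getAge()).toString();
    }
}

// Variant with an immutable name: nothing the caller does can change it.
class ImmutableNamePerson {
    private final String name;
    private final int age;

    ImmutableNamePerson(String name, int age) {
        this.name = name;
        this.age = age;
    }

    String getName() { return name; }
    int getAge() { return age; }

    static String describe(ImmutableNamePerson p) {
        return p.getName() + ", age " + p.getAge();  // builds a new String
    }
}

class PersonDemo {
    public static void main(String[] args) {
        MutableNamePerson m = new MutableNamePerson("Bill", 24);
        System.out.println(MutableNamePerson.describe(m));   // Bill, age 24
        System.out.println(MutableNamePerson.describe(m));   // Bill, age 24, age 24

        ImmutableNamePerson i = new ImmutableNamePerson("Bill", 24);
        System.out.println(ImmutableNamePerson.describe(i)); // Bill, age 24
        System.out.println(ImmutableNamePerson.describe(i)); // Bill, age 24
    }
}
```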

As you mentioned, you can use StringBuffer or StringBuilder for string handling, so it need not be memory or time consuming. It is also easy to create those from Strings. I don't see a problem here.

I found an article on this that seems to explain it a lot better than I do :) http://macchiato.com/columns/Durable2.html

Java seems rather flawed in this respect (among others, but they are for another topic).

Considering everything is essentially an implicit reference (except for primitive data types, which seems to be another arbitrary flaw), I'd expect that modifying a returned String would modify the original, as well. I'd also expect that final makes a variable unmodifiable, as happens with primitive types - however, final only makes the reference itself unmodifiable, so you can't reassign it to a new object. Therefore, Java has no const-correctness and is forced to simulate it by making arbitrary object types immutable through return-a-copy on seemingly mutable methods.
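
The point about final can be seen in a few lines (a sketch; the class and method names are made up). A final reference cannot be reassigned, but the object behind it can still be mutated if its type allows it; String simply offers no mutating methods.

```java
// Illustrative only: FinalReferenceDemo and demo are made-up names.
class FinalReferenceDemo {
    static String demo() {
        final StringBuilder sb = new StringBuilder("abc");
        sb.append("def");               // allowed: the object itself is mutable
        // sb = new StringBuilder();    // would not compile: sb is final

        final String s = "abc";
        String t = s.concat("def");     // s is untouched; t is a new String
        // s = t;                       // would not compile: s is final

        return sb + "|" + s + "|" + t;
    }

    public static void main(String[] args) {
        System.out.println(demo());     // abcdef|abc|abcdef
    }
}
```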

This approach to string handling is one of the few things Java got just right, at least compared to the mess that is its numeric types.

Threading aside, string manipulation is an annoying topic. Languages do well to treat strings in this way.

An important thing to keep in mind here: Java strings are 'words'. They are not a char *, a collection of chars, or an array of values. They are conceptual strings, a text, and should be treated as such.

As for performance: experience shows that in typical string processing, the straightforward approach using immutable strings and garbage collection is an order of magnitude faster than C++ methods. Real-world practice shows that string manipulation tends to discard data almost immediately after use in just about all cases.

Obviously, there is a whole class of in-place algorithms that can offer vastly superior performance, but they also introduce systematic flaws which hamper productivity and increase the cost of development far beyond what is reasonable. A typical example is the required size of a buffer. There are effectively no use cases left where a predetermined size will fit. Case in point: buffer overflows. Still unsolved, still here.

In addition, working on immutable strings makes all operations streaming in nature: either use the existing instance, or process left to right into a new one. This makes copies cheap (but still more expensive than in-place).

C and C++ are blown clear out of the water by managed languages when it comes to string handling. An adequately large application will, without any effort spent on optimization, outperform the equivalent C or C++ application. This was shown long ago for, IIRC, TeX. Specialized mallocs, local allocation optimizations, and a lot of effort are needed to outperform it.

Another string handling concept that tends to crop up from time to time is ropes. In practice, however, these fall into the same category as tries or binary insertion sort: elegant and flexible, but with a constant-factor overhead that makes them slower in just about any real-world scenario.

Immutable types are very much in line with functional programming concepts, which is why learning a functional language is always beneficial.

For trivial cases, these differences are irrelevant, with the exception of a few very specialized domains.


Edit: HashMap has special optimizations which benefit from immutability. When used with strings (or other immutable objects), it pays to reuse the *same* instance for the key. This results in lightning-fast lookups, which may be counter-intuitive compared to map in C++. Other structures can benefit from this as well; there are also many optimizations hidden in the standard library, and it could have something to do with intern() as well (it's been a while).

For extensive lookups, I even went as far as to use a helper set to hold unique instances loaded at runtime, since it offered such a huge benefit.
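
A helper along these lines might look like the sketch below. This is an assumption about the general shape, not the code actually used: StringPool and canonical are invented names, and a Map is used instead of a Set so the stored instance can be returned. Equal strings loaded at runtime are collapsed to one shared instance, which is safe only because Strings are immutable; String.intern() provides a comparable JVM-wide facility.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper that keeps one canonical instance per distinct string value.
class StringPool {
    private final Map<String, String> pool = new HashMap<>();

    // Returns the previously stored instance equal to s, or stores and returns s.
    String canonical(String s) {
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }
}

class CanonicalKeyDemo {
    public static void main(String[] args) {
        StringPool pool = new StringPool();

        // new String(...) forces distinct objects, as if read from a file at runtime.
        String a = pool.canonical(new String("player"));
        String b = pool.canonical(new String("player"));
        System.out.println(a == b);   // true: both are the pooled instance

        // The built-in alternative: intern() returns the JVM's canonical instance.
        String c = new String("player").intern();
        System.out.println(c == "player");  // true: literals are interned
    }
}
```

Whether reusing the same key instance speeds up lookups noticeably depends on the map implementation and the workload, so treat the benefit as something to measure rather than assume.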
