    Invelos Forums->DVD Profiler: Contribution Discussion Page: 1... 4 5 6 7 8 ...15  Previous   Next
Credit Name Parsing
Addicted2DVD (DVD Profiler Unlimited Registrant, Star Contributor)
Registered: March 13, 2007
Reputation: Highest Rating
United States Posts: 17,334
Quoting Taro:
Quote:
Quoting Addicted2DVD:
Quote:
Yes... I believe that is what the end non-contributing user wants... but even Ken said that he also has to think of ease of use for contributors as well.

Actually, as a contributor I'd prefer a system that asks me when contributing: "please select which actor this is or create a new one" rather than the current, to me very discouraging, system. My local is very well kept but by now so different from what the online requires (warts and errors and all) that I simply can't bring myself to adapt my local, contribute and readapt back to a correctly linking system for my local.

I think there's a group of users that would say: "gah, even more work for contributions", whereas other contributors conversely will get a new boost, knowing that what they do now is all the more useful than in the older system.


I think some of you are misunderstanding what I am saying. I agree with a system something like that. What I disagree with is being forced to look into every name in the cast and crew. I believe there should be an option, for those you don't know, to just submit "credited as" only.

Maybe have one of the options that comes up along with the names be an option to say "I don't know. Submit as credited only."

This way a person doesn't feel they have to do all that research if they don't feel like it. Or, if they feel the research they did doesn't bring up results as strong as they'd like, they can select that option to still contribute the info without putting in guesswork.
Pete
Addicted2DVD (DVD Profiler Unlimited Registrant, Star Contributor)
Registered: March 13, 2007
Reputation: Highest Rating
United States Posts: 17,334
Quoting m.cellophane:
Quote:
Quoting DarklyNoon:
Quote:
2. Only one cast and Crew list for a Film, no more DVD based lists. The very few exceptions that we will need to live with are outweighed by the amazing simplicity we would gain.

I think this is a good idea. Also, we wouldn't need to necessarily lose the alternate data, if it exists. If a film has different credits in different regions or different formats, we could have alternate cast and crew lists available.


While I agree this would be a good thing if possible (especially if alternate lists can be done), going by his request in the original post I think it may be beyond what he is willing to do at this point. If this is the case, I hope Ken will say so. This way we can concentrate more on what he is actually planning instead of concentrating on something that isn't going to happen yet.
Pete
Nexus the Sixth (DVD Profiler Unlimited Registrant, Star Contributor)
Contributor since 2002
Registered: March 13, 2007
Reputation: High Rating
Sweden Posts: 3,197
I've been thinking some more about this parsing business... Since some people want a single name field, some want two names and some want to keep the current system, why don't we just add a fourth name field that would be a single, independent name field? That way everyone could get what they want and we don't lose all the parsing work that has already been done. The fourth field would be the default entered and displayed name for any newly created profile, but if they already exist, could be automatically populated from the current three name fields. Or would this be too complex to implement?
First registered: February 15, 2002
Addicted2DVD (DVD Profiler Unlimited Registrant, Star Contributor)
Registered: March 13, 2007
Reputation: Highest Rating
United States Posts: 17,334
Not sure I understand what is gained from this. I mean, how does it help matters at all? Parsing would still have to be done for the original three name fields.
Pete
VirusPil (DVD Profiler Unlimited Registrant, Star Contributor)
uncredited
Registered: January 1, 2009
Reputation: Highest Rating
Germany Posts: 3,087
I'll try to throw in my ideas.

First, I fully agree we need a unique ID for every person (all cast and crew members) in the database, to get rid of the birth year (BY) as separator. Every person would get such a unique ID, whether or not there is a name variant.
With the ID, all name variants get stored together.
All persons are stored online in a person database.
In DVDP, a local name (which every user can enter freely) is shown.
If nothing is entered, the variant with the most credits is shown (which requires that this information is also stored with the person). Or, if this is too much programming or not possible, simply the first entered variant is shown unless the user enters a different one.
It should be always possible to enter a new person and mark it as local only. Local only persons will be never uploaded.
The search field in the edit profile dialogue and the filter section should search within all variants and show me the persons who have them. It should be possible to see the variants (maybe in a pop-up when I point the cursor at the entry).
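
To make the points above concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: the `Person` class, its field names and the `search` helper are hypothetical and not part of DVD Profiler. It shows one record per ID holding all name variants, a displayed name that prefers the user's local name and otherwise falls back to the most credited variant (ties go to the variant entered first), and a search that looks through every variant.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Person:
    """One cast or crew member, identified by ID instead of name + birth year."""
    person_id: int
    # Credit count per name variant. Dicts keep insertion order, so the first
    # key is the variant that was entered first.
    variants: Dict[str, int] = field(default_factory=dict)
    local_name: Optional[str] = None  # free-text name chosen by the user
    local_only: bool = False          # hypothetical flag: never uploaded if True

    def add_credit(self, credited_as: str) -> None:
        self.variants[credited_as] = self.variants.get(credited_as, 0) + 1

    def display_name(self) -> str:
        if self.local_name:
            return self.local_name
        # max() returns the first maximal item, so a tie falls back to the
        # variant that was entered first. An empty record yields "".
        return max(self.variants, key=self.variants.get, default="")


def search(people: List[Person], text: str) -> List[Person]:
    """Return every person with at least one variant containing the text."""
    needle = text.casefold()
    return [p for p in people
            if any(needle in variant.casefold() for variant in p.variants)]


hill = Person(person_id=1)
for name in ["Mario Girotti", "Terence Hill", "Terence Hill"]:
    hill.add_credit(name)

print(hill.display_name())                               # most credited variant
print([p.person_id for p in search([hill], "girotti")])  # found via a variant
```

Storing the counts on the person record keeps the lookup cheap, at the cost of having to update them whenever a credit is added or removed.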

Let's start with the approach that needs the most space on the user's hard disk (and may be slow) and the most work before contributing:
1.) The whole person database gets downloaded to the local database once a day to be up to date. New entries in the persons database need to be made online. New entries need a note saying which profile the entry is for (UPC/locality/Title/Original title). The entries get forwarded by screeners or voters, who make sure there is just one new entry for a new person and not the same person twice.
Of course, when entering new entries online, existing persons should also show up (same as locally), for those who might add a new person without checking.
-> The negative aspects of this idea are easy to see. But there are also some positive aspects.

... I see I'm too slow at typing, so I'll have to go on with the next points later/tomorrow. (My wife is waiting)
Addicted2DVD (DVD Profiler Unlimited Registrant, Star Contributor)
Registered: March 13, 2007
Reputation: Highest Rating
United States Posts: 17,334
Quoting VirusPil:
Quote:

...
It should be always possible to enter a new person and mark it as local only. Local only persons will be never uploaded.
...


I am not sure I understand the purpose of what I quoted. If they are in the credits then they should be uploaded. If you mean for uncredited actors... we already have this by just not checking the box when uploading. I don't see that changing anyway... not sure why it would.
Pete
Ace_of_Sevens (DVD Profiler Unlimited Registrant, Star Contributor)
Registered: December 10, 2007
Reputation: High Rating
Posts: 3,004
Names + numbers would solve a lot of problems, but not the issue of changing every name in the DB when a common name changes. In fact, it may create additional problems of having to renumber people when common names change. However, if combined with a per-film crediting system instead of per name, it would at least make a big improvement. Next time Robin Wright (Penn)'s name changes, we'd only have to update 40 entries, not 592.

It would work like so. You would enter most of the DVD as now. Part of the information would be what titles it contained. You would then look this up online, where it would be identified by original title, production year and an additional field for clarification if needed, in case there are multiple versions with different credits, or two movies of the same name in the same year. TV shows could be entered as original show title: season #: episode #: Episode Name in the title field. This information would auto-populate original title, genres, cast and crew. It could auto-separate when more than one title was on the disc. When we made changes, we'd be changing the film DB, not the DVD, and the update system would just need to check for film changes as well as DVD changes. The initial conversion could be messy, but once up and running, it would save everyone a lot of trouble, including screeners, as there wouldn't be nearly as many updates to check.
Addicted2DVD (DVD Profiler Unlimited Registrant, Star Contributor)
Registered: March 13, 2007
Reputation: Highest Rating
United States Posts: 17,334
One of us must be confused... as the way I understood it, with the IDs you wouldn't have to worry about name changes any longer. From what I thought they were saying, you use the ID for the person... an ID that always stays the same, and for each ID you get a list of names that person uses. So they all link. So when a new name comes up you add to that list.

But what that wouldn't solve is when two people use the same name in their list. I believe you would still need something similar to birth years for when more than one person uses the same name.
Pete
VirusPil (DVD Profiler Unlimited Registrant, Star Contributor)
uncredited
Registered: January 1, 2009
Reputation: Highest Rating
Germany Posts: 3,087
Quoting Addicted2DVD:
Quote:
Quoting VirusPil:
Quote:

...
It should be always possible to enter a new person and mark it as local only. Local only persons will be never uploaded.
...


I am not sure I understand the purpose of what I quoted. If they are in the credits then they should be uploaded. If you mean for uncredited actors... we already have this by just not checking the box when uploading. I don't see that changing anyway... not sure why it would.


Simple answer: for those users tracking something completely different. ("Wineprofiler")
Mark Harrison (DVD Profiler Unlimited Registrant)
I like IMDB
Registered: March 13, 2007
Reputation: Great Rating
United States Posts: 3,321
I've picked up a couple things reading this thread.

Quoting GSyren:
Quote:
I know a single name field would break current functionality, but personally that's something that I would happily sacrifice. If we can't decide proper parsing (e g Kristin Scott / Thomas vs Kristin / Scott Thomas), then sorting by last name is crippled anyway. But that's just me...


Why do we need to decide proper parsing? If both variations are linked together, you've just pleased 99% of the users. The other 1% are always free to clean up the bad name if it bugs them.

I've also been following the discussion (mainly from Pete) about how he just wants to type what he sees on the screen and perhaps link a few of the bigger names. I like this as well and suggest that perhaps these actors have a field to indicate whether they've been linked or not. Those who care could then easily identify which cast / crew need some additional research and which have already been taken care of.
Addicted2DVD (DVD Profiler Unlimited Registrant, Star Contributor)
Registered: March 13, 2007
Reputation: Highest Rating
United States Posts: 17,334
Quoting VirusPil:
Quote:
Quoting Addicted2DVD:
Quote:
Quoting VirusPil:
Quote:

...
It should be always possible to enter a new person and mark it as local only. Local only persons will be never uploaded.
...


I am not sure I understand the purpose of what I quoted. If they are in the credits then they should be uploaded. If you mean for uncredited actors... we already have this by just not checking the box when uploading. I don't see that changing anyway... not sure why it would.


Simple answer: for those users tracking something completely different. ("Wineprofiler")


Still not sure if that is needed... as in those cases you wouldn't be submitting it anyway.  But there may be something there I am just not understanding... wouldn't be the first time. 
Pete
Blair (DVD Profiler Unlimited Registrant)
Resistance is Futile!
Registered: October 30, 2008
United States Posts: 1,249
Ok, I know that I am repeating both myself as well as others, but it's easier for me to just say it all at once rather than tailor my post against what was said.


1) Cast/Crew Unique IDs sounds great, but it is no more helpful than the current system if there isn't a way on the fly to identify individuals against the current ID.

Pick a name that is common -- John Doe for instance -- and look through the online database at how often that name shows up. OK, first we will have to drift away from the old system into the new one. So I resubmit the data found in my local DB from a single movie that I own, where all cast had already been stored in the online DB (same as any other update in the current system). How then would that initial link between the actor and a unique ID be established? If it's done automatically, then there is no linking. You can't assume every John Doe is the same; you can't assume that a new one added is the same as one already in the system. So, before John Doe (not to mention every other person) can be entered into the system as connected to a DVD, the unique ID already needs to be established in the system for the name to have something to latch onto. Otherwise it is given a new unique ID, the next submission of the same actor (particularly if the submissions are going in simultaneously) will get a different unique ID, and so on... and next thing you know we have a linking system that is no better (and probably even worse) than what has been complained about for years.


2) Film Unique IDs creation for me is even more important than changing the cast/crew system because it can be a major workload reducer. Currently, a film's data is entered and then copied to several UPCs, including those in other languages. If a change is made, then to be helpful people have to go through every copy in every language to make that same change. This type of system has flaws of its own in its most basic form, but it also has practical applications.


3) ID Merging would be another necessity. Looking at this from a cast ID system only (no Film IDs), I am entering a brand new film not yet in the database. Assume for the moment every other cast entry and UPC/Disc ID entry in the database has been fully cleaned up and integrated; the cast system is currently as perfect as it could get. From a local-versus-online DB standpoint, there are still three types of cast groups: a) Cast members in my local DB that were pulled from the online DB because I own and updated the films. These are bridged between local and online DBs by the Unique ID just like we wanted; b) Cast that I am adding which are new to my local DB (no other film was ever added or updated from online DB) but exist in the online DB. Locally these have no IDs (or at least none that are yet synced to the online DB); c) Brand new actors that I am adding that are not yet in the online database either.

Even if there is a way to check online what ID a cast member uses prior to submission, you will still have many cast members added as "new" to the online DB (with brand new IDs), because in the current system all that is required is entering a name, although parsing to find the best name, which in turn allows for adding a "Credited As", adds to the 'correctness' of the online DB. In the new system you'll just have members accepting that "this John Doe is different from the others", whether by lack of research or through slip-ups.

There will need to be a way to later say "OK, this guy in this film, under this DVD/Disc ID with this cast member ID, is the same as this other guy with the same name but a different ID; someone simply messed up; now we need to merge to fix the problem."
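
One way such a merge could work is sketched below in Python. This is only an assumption about a possible design, with hypothetical names (`merged_into`, `resolve`, `merge`, `credits_of`) and made-up IDs: instead of rewriting every profile, the merge is recorded once as a redirect from the duplicate ID to the surviving ID, and lookups follow the redirect.

```python
from typing import Dict, List, Tuple

# Hypothetical credit rows: (profile ID, cast member ID, credited-as name).
Credit = Tuple[str, int, str]


def resolve(person_id: int, merged_into: Dict[int, int]) -> int:
    """Follow merge records until the surviving ID is reached."""
    while person_id in merged_into:
        person_id = merged_into[person_id]
    return person_id


def merge(duplicate: int, survivor: int, merged_into: Dict[int, int]) -> None:
    """Record that `duplicate` is the same person as `survivor`."""
    duplicate = resolve(duplicate, merged_into)
    survivor = resolve(survivor, merged_into)
    if duplicate != survivor:  # already the same person: nothing to do
        merged_into[duplicate] = survivor


def credits_of(person_id: int, credits: List[Credit],
               merged_into: Dict[int, int]) -> List[Credit]:
    """All credits of a person, including those filed under merged IDs."""
    target = resolve(person_id, merged_into)
    return [c for c in credits if resolve(c[1], merged_into) == target]


credits: List[Credit] = [
    ("DVD-A", 100, "John Doe"),
    ("DVD-B", 200, "John Doe"),  # same person, entered under a new ID by mistake
    ("DVD-C", 300, "John Doe"),  # a genuinely different John Doe
]
merged: Dict[int, int] = {}
merge(duplicate=200, survivor=100, merged_into=merged)

print(credits_of(100, credits, merged))  # DVD-A and DVD-B
print(credits_of(300, credits, merged))  # DVD-C only
```

Resolving both IDs before recording the redirect means a repeated or reversed merge request cannot create a loop.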



If options #1 and #3 are taken as the forefront of the new system, do we really think that this will be an improvement over the system that we currently have? It sounds to me like the exact same problem we are dealing with now, just done in a different way, plus some possible additional problems.
If at first you don't succeed, skydiving isn't for you.

He who MUST get the last word in on a pointless, endless argument doesn't win. It makes him the bigger jerk.
Taro (DVD Profiler Desktop and Mobile Registrant, Star Contributor)
Registered: February 23, 2009
Reputation: High Rating
Belgium Posts: 1,580
Quoting Addicted2DVD:
Quote:
I think some of you are misunderstanding what I am saying. I agree with a system something like that. What I disagree with is being forced to look into every name in the cast and crew. I believe there should be an option, for those you don't know, to just submit "credited as" only.

Maybe have one of the options that comes up along with the names be an option to say "I don't know. Submit as credited only."

This way a person doesn't feel they have to do all that research if they don't feel like it. Or, if they feel the research they did doesn't bring up results as strong as they'd like, they can select that option to still contribute the info without putting in guesswork.

Sorry, yeah, I misunderstood you. I can certainly see how it would be a problem if a contributor isn't certain whether a cast member is person A, B or C, or if they don't want to go through too much hassle. Hey, how about this idea then. Example of a cast contribution:

Robert Downey
Tom Cruise
John Doe

There are multiple Robert Downeys already in the online DB, so the contribution system shows next to that name the corresponding IDs or the option to create a new ID. Clicking an ID shows in a pop-up a list of the profiles that ID is listed in, a bit like the current CLT. It would look a bit like this:

( ) ID 000000        ( ) ID 213466196        ( ) Create new ID

So the contributor can opt for the easy way out and click "Create new ID", or can take the time to check the other IDs and see if one matches the actor in question. Or, if it's a third new one, click "Create new ID" as well. Later on, other contributors (a bit like the common name thing) can go back to that profile and say, for example: Actor ID 000000 is the same as ID 3354949 (with documentation), and the two are merged after approval.


Tom Cruise then, there is only one, so the system offers only two choices: existing ID or create new


John Doe is entirely new, so there the question isn't even asked: new ID is created automatically. However, the user has the possibility to click on the name, search the DB manually for an alternate name and link the two. But default is create new ID.

That's the gist of it. I don't know if it's feasible programming-wise, but would that alleviate some of the reservations? In the worst case, a contributor can just click "Create new ID" for all and be done with it. Others can link the two later on, and the advantage is that it only needs to be done once and not for every single profile separately.

The only possible problem I see is if all contributors just click "Create new ID" and we end up with tons of actors with multiple IDs who are actually the same.
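
The contribution flow described above could be sketched roughly as follows in Python. The function names, the index layout, the ID used for Tom Cruise and the format of newly created IDs are all assumptions made for illustration. A name with existing IDs offers those IDs plus "Create new ID"; a name that is entirely new gets an ID without any question being asked.

```python
import itertools
from typing import Dict, Iterator, List, Optional

CREATE_NEW = "Create new ID"


def options_for(name: str, index: Dict[str, List[str]]) -> List[str]:
    """Existing IDs for this credited name, plus the 'create new' choice."""
    return index.get(name, []) + [CREATE_NEW]


def assign_id(name: str, choice: Optional[str], index: Dict[str, List[str]],
              new_ids: Iterator[str]) -> str:
    """Return the ID for one credit, creating a new one when asked or needed."""
    known = index.setdefault(name, [])
    if choice in known:        # contributor picked an existing ID
        return choice
    new_id = next(new_ids)     # 'create new' chosen, or name never seen before
    known.append(new_id)
    return new_id


# Hypothetical online index: credited name -> IDs already using that name.
index: Dict[str, List[str]] = {
    "Robert Downey": ["000000", "213466196"],
    "Tom Cruise": ["000042"],
}
new_ids = (f"N{n:06d}" for n in itertools.count(1))

print(options_for("Robert Downey", index))  # two IDs, then the create option
print(options_for("John Doe", index))       # only the create option
print(assign_id("John Doe", None, index, new_ids))
```

Because every "Create new ID" click adds another entry for the same name, this flow only stays manageable if duplicates can be merged afterwards.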
VirusPil (DVD Profiler Unlimited Registrant, Star Contributor)
uncredited
Registered: January 1, 2009
Reputation: Highest Rating
Germany Posts: 3,087
Quoting Addicted2DVD:
Quote:
Quoting VirusPil:
Quote:
Quoting Addicted2DVD:
Quote:
Quoting VirusPil:
Quote:

...
It should be always possible to enter a new person and mark it as local only. Local only persons will be never uploaded.
...


I am not sure I understand the purpose of what I quoted. If they are in the credits then they should be uploaded. If you mean for uncredited actors... we already have this by just not checking the box when uploading. I don't see that changing anyway... not sure why it would.


Simple answer: for those users tracking something completely different. ("Wineprofiler")


Still not sure if that is needed... as in those cases you wouldn't be submitting it anyway.  But there may be something there I am just not understanding... wouldn't be the first time. 


It would only be needed for my first idea (1.), because there, to have a new person in the database, it has to be added online. So you would lose the possibility of also tracking, for example, LaserDisc, VHS, ... which have actors in them who are not in the database.
Of course, for the next idea, which adds the person at submission, it wouldn't be needed.
DJ Doena (DVD Profiler Desktop and Mobile Registrant, Star Contributor)
Registered: May 1, 2002
Registered: March 14, 2007
Reputation: Highest Rating
Germany Posts: 6,745
Quoting DJ Doena:
Quote:
I could imagine the following system to accommodate both sides:

You can enter the credits as credited, straightforward, text-only. No birth year. Just a one-, two- or three-name field. When entered as such, these people don't link to anyone. They are just text. If you click on Bruce Willis in Armageddon, it won't show you Bruce Willis in Die Hard.
You can contribute that and that's it then.

But you can do this too: You can say that this actor entry in that profile is linked to an actor entry in the online actor database (OADB). This online actor entry is very compact. The actor's most credited name* and maybe a birth year (and a gender) to make it easier to spot if we're talking about the same person.
Then this entry in your profile becomes more: You have a "credited as" AND an ID. And if you have other profiles where a guy with the same ID appears, they will link. Now Die Hard will show up when you click on Bruce in Armageddon.

* the most credited name is determined automatically by using all the "credited as" where that actor's ID was used.

What you need for this OADB is a search function where you can type in the name and it'll show you all the actors with that name (ideally considering all "credited as"ses) and then the profiles where that actor appears. If you don't find him/her, you can create a new actor ID which you then link with that entry in your local database.

This actor ID info is also contributable of course.

Hope that was somewhat understandable.
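
The two kinds of credit described above can be sketched in a few lines of Python. The `Credit` class, its field names and the actor ID value are hypothetical and only serve to illustrate the idea: a credit without an ID is plain text and links to nothing, while credits that share an ID link to each other regardless of how the name was credited.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Credit:
    profile: str                    # DVD profile title
    credited_as: str                # name exactly as shown on screen
    actor_id: Optional[int] = None  # None means text only, links to nothing


def linked_profiles(clicked: Credit, credits: List[Credit]) -> List[str]:
    """Other profiles that share the clicked credit's actor ID."""
    if clicked.actor_id is None:
        return []                   # text-only credits never link
    return [c.profile for c in credits
            if c.actor_id == clicked.actor_id and c.profile != clicked.profile]


text_only = Credit("Armageddon", "Bruce Willis")
linked = Credit("Armageddon", "Bruce Willis", actor_id=1)
die_hard = Credit("Die Hard", "Bruce Willis", actor_id=1)

print(linked_profiles(text_only, [text_only, die_hard]))  # no links
print(linked_profiles(linked, [linked, die_hard]))        # links to Die Hard
```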


Quoting DJ Doena:
Quote:
Addendum to my previous post:

The transformation from the old to the new linking system could simply be done by creating an ID for every unique actor in the current database, basing it on the distinction of what currently is considered a unique entry, e.g. ID1 = KevinSmith1970, ID2 = KevinSmith1963, ID3 = HelenaBonhamCarter, ID4 = RobertDowneyJr.
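
As a minimal sketch of that transformation (in Python, not the C# program mentioned in this post), one could hand out one ID per distinct old-style entry. The `OldKey` shape and the `assign_ids` function are assumptions for illustration; the sketch keys only on name and birth year and ignores any "credited as" information.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical old-style key: (full name, birth year or None).
OldKey = Tuple[str, Optional[int]]


def assign_ids(old_entries: Iterable[OldKey]) -> Dict[OldKey, int]:
    """Give every distinct old-style entry its own ID, in order of first appearance."""
    ids: Dict[OldKey, int] = {}
    for key in old_entries:
        if key not in ids:
            ids[key] = len(ids) + 1
    return ids


entries = [
    ("Kevin Smith", 1970),
    ("Kevin Smith", 1963),
    ("Helena Bonham Carter", None),
    ("Robert Downey Jr.", None),
    ("Kevin Smith", 1970),  # repeated credit, reuses the first ID
]
ids = assign_ids(entries)
for key, actor_id in ids.items():
    print(actor_id, key)
```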


I hacked together a small program to illustrate my idea. It takes a collection.xml as input and generates an "Online Actor Database" (a text file). It also outputs text files for every movie in that collection.xml. BUT: both the OADB and the movie files contain a unique key for every actor. And yet the movie files contain the actual "credited as".

Take my sample collection.xml, transform it and look for Terence Hill / Mario Girotti. In Renegade he's credited as Terence Hill, in Winnetou II as Mario Girotti. And yet he has the same ActorID, the one that can also be found in the OADB text file.

http://doena-soft.de/dvdprofiler/3.6.0/LinkingSystemTransformer.zip

And the 512 lines of C# code that I coded this evening / copied from other DVDP projects of mine:

http://doena-soft.de/dvdprofiler/3.6.0/LinkingSystemTransformer_src.zip
Karsten
VirusPil (DVD Profiler Unlimited Registrant, Star Contributor)
uncredited
Registered: January 1, 2009
Reputation: Highest Rating
Germany Posts: 3,087
Quoting Addicted2DVD:
Quote:
One of us must be confused... as the way I understood it, with the IDs you wouldn't have to worry about name changes any longer. From what I thought they were saying, you use the ID for the person... an ID that always stays the same, and for each ID you get a list of names that person uses. So they all link. So when a new name comes up you add to that list.

But what that wouldn't solve is when two people use the same name in their list. I believe you would still need something similar to birth years for when more than one person uses the same name.



Wouldn't it be like this: you would always have the possibility to add a new one. For example:
In database: John Doe (ID 123456789) [aka John A. Doe]
If you add a new one, the second will be: John Doe (ID 987654321)
The ID replaces the BY as separator and also collects the variants.