20 Replies Latest reply on Aug 18, 2014 10:37 AM by Rob D

    File size of versions in EPDM

    Pete Yodis

      Can anyone tell me if versions of files in EPDM are full sized?  Meaning if I have a file that is 1MB in size and I version it 5 times, will there be 5MB worth of storage occupied by all 5 versions?

        • Re: File size of versions in EPDM
          Jeremy Feist

          yes, and no.

           

          if the only change in a file from version 2 to version 3 is in the metadata, the system will make the newer metadata point to the existing file rather than store a new copy for the new version.

           

          also, there is a "cold storage" utility you can set up to store older versions in a compressed and off-line location.

           

          edit: also, each version of the file would be the size that it was at that version. files will grow and shrink as you edit them.
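To make the "yes, and no" concrete, here is a minimal sketch of the accounting described above. It assumes each version can be described by its size and a flag saying whether a new file was actually written; `Version`, `archive_bytes` and the sample sizes are hypothetical names and numbers for illustration, not anything exposed by EPDM.

```python
from dataclasses import dataclass


@dataclass
class Version:
    size_bytes: int         # size of the file as of this version
    new_file_written: bool  # False when the version only points at an existing archive file


def archive_bytes(versions):
    """Sum the sizes of the versions that caused a new file to be stored."""
    return sum(v.size_bytes for v in versions if v.new_file_written)


MB = 1_000_000
history = [
    Version(1 * MB, True),   # initial check-in
    Version(1 * MB, True),   # geometry changed
    Version(1 * MB, False),  # metadata-only change, reuses the previous file
    Version(2 * MB, True),   # file grew after more edits
    Version(2 * MB, False),  # metadata-only change again
]

print(archive_bytes(history) / MB)  # 4.0
```

Five versions here occupy 4 MB rather than the sum of all five sizes, because two of them share a file with the version before them.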

          • Re: File size of versions in EPDM
            Adrian Velazquez

            I would assume that each version would be the size that corresponds to the amount of data it stores.

              • Re: File size of versions in EPDM
                Pete Yodis

                Jeremy,  I thought I had seen that similar answer elsewhere.  Thanks for the information.

                 

                A related question then is when do you place your files in the vault?  We are primarily a product oriented company (OEM).  We have a pretty good amount of variation in early design/concepting.  That translates to a lot of files that are not needed later.  In our WorkGroup vault we do not place files in the vault until we want to release them.  With EPDM, the encouraged scheme seems to be to work in the vault earlier.  I am wondering about the ramifications of vault size growth with an extreme number of versions.  Our current WorkGroup vault is 115 GB and we now have an EPDM server ready to roll with over 600GB of storage (I believe).  If we want to create a fully electronic sign-off process that creates a version of the document as it rolls through the checking and release steps, then that would add to the pile as well.  I know there are cold storage options as well... but I am just trying to wrap my head around how we want to operate as far as when to vault files in the first place, how the workflow would affect the number of versions, and what that means to vault size.  How do others operate in this manner?

                  • Re: File size of versions in EPDM
                    Wayne Matus

                    Another option to save disk space on archive server is to compress the older versions of files. Check out the Administration Guide for instructions on setting it up.

                    • Re: File size of versions in EPDM
                      Charley Saint

                      Pete,

                       

                      The only way I can think of to reasonably maintain files outside of the vault is if you have a prototyping revision scheme that you don't care too much about and check it in when you are switching schemes. That way it doesn't care if the rev was X01 or X27 because it's going to A no matter what. I would still recommend against it.

                       

                      Generally, the actions that version files are checking in a modified file, checking in a file with modified references, modifying the data card, or sending a file through a transition that sets a variable. If you modify a variable that has an attribute attached to the filetype (like Revision where it's set to map to the custom property Revision) then you will create a new file on the archive server. The only time you don't create a new file is if you modify variables that don't have attributes mapped to the file.
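The rule in the paragraph above can be written as a small decision function. This is only a sketch of the behaviour as this post describes it; the function name, its parameters and the variable names in the example are hypothetical, and nothing here calls EPDM.

```python
def writes_new_archive_file(file_modified, references_modified,
                            changed_variables, mapped_variables):
    """Return True when a new version would also store a new file.

    changed_variables: names of data card variables changed in this version.
    mapped_variables:  names of variables mapped to an attribute inside the file.
    """
    if file_modified or references_modified:
        return True
    # A variable mapped into the file (e.g. a custom property) rewrites the file itself.
    return any(name in mapped_variables for name in changed_variables)


mapped = {"Revision", "CheckedBy"}

print(writes_new_archive_file(False, False, {"Revision"}, mapped))     # True
print(writes_new_archive_file(False, False, {"InternalNote"}, mapped))  # False
```

Under this reading, a sign-off workflow that writes initials into mapped properties stores a new file at every step, while one that only sets unmapped variables does not.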

                       

                      Also EPDM 2014 is supported on Windows Server 2012 which has an awesome new feature for data deduplication that I am certain will be put fully to the test soon by a great deal of us. So keep an eye out for that.

                        • Re: File size of versions in EPDM
                          Pete Yodis

                          Charley,

                           

                               Thanks for the information.

                           

                          On the first paragraph: could you elaborate on why you would not recommend putting files into the vault later in the scheme you mentioned?

                           

                          On the 2nd paragraph... We desire (right now) to set up workflows to capture the drafter, checker, and then approver as variables mapped to document attributes.  Drafter checks in at Work In Process... a version is created.  Drafter finishes and submits for checking.  Engineer checks and approves...  Another version is created, as the engineer's initials are captured and saved into the file.  The file now shows engineer checked data.  The file moves to approval and is approved by the manager with his initials written to the document.... another version is created and the rev level incremented for a released document.  There would be at least 3 versions before a release (and possibly more if it's kicked around a couple of times).  The versions of a file could really add up quickly and the size of the vault could grow quickly, especially if larger data sets (large assembly drawings come to mind) are versioned many times.

                           

                          On the third paragraph. We have Windows server 2008 R2 and EPDM 2013 installed.  Can we easily upgrade the server OS to 2012 version (I am not in IT) when we go to EPDM 2014?  What does deduplication do for us?

                            • Re: File size of versions in EPDM
                              Charley Saint

                              Pete,

                               

                              The main pitfall for working on files outside of the vault then moving them in is referential integrity. There's a right and a wrong way to move SolidWorks files and it's different for inside the vault and outside and allowing multiple users to do this is just asking for trouble.

                               

                              Usually IT will want to format a server and install fresh rather than take an upgrade path. This isn't too bad as usually they are just formatting the C drive, where the archives almost certainly are not. Make sure you talk to your VAR about what is needed to restore your server after this happens before attempting it, but it's not a huge ordeal. Deduplication is a new feature that scans volumes for duplicate blocks of data and "merges" them together. In many cases this has been shown to triple the capacity of a volume; we're looking to ramp up some tests here soon to see how well it works. You can find out more here:

                              http://blogs.technet.com/b/filecab/archive/2012/05/21/introduction-to-data-deduplication-in-windows-server-2012.aspx
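As a rough illustration of why merging duplicate blocks helps with versioned files, the sketch below splits data into fixed-size blocks, hashes each one, and counts how many distinct blocks would need to be stored. Fixed-size blocks and the `dedup_stats` helper are simplifying assumptions made for this example; the Windows Server feature described in the linked article may select and store its chunks quite differently.

```python
import hashlib


def dedup_stats(blobs, block_size=4096):
    """Return (total_blocks, unique_blocks) across all the given byte strings."""
    seen = set()
    total = 0
    for data in blobs:
        for start in range(0, len(data), block_size):
            block = data[start:start + block_size]
            seen.add(hashlib.sha256(block).digest())
            total += 1
    return total, len(seen)


# Two "versions" of a file that differ only in their last block.
v1 = b"A" * 4096 + b"B" * 4096 + b"C" * 4096
v2 = b"A" * 4096 + b"B" * 4096 + b"D" * 4096

total, unique = dedup_stats([v1, v2])
print(total, unique)  # 6 4
```

Six blocks are written but only four are distinct, so the second version costs one extra block instead of three. How close real CAD files come to this depends on how much of the file is rewritten on each save.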

                                • Re: File size of versions in EPDM
                                  Pete Yodis

                                  Charley,

                                   

                                    Couldn't SolidWorks Explorer be used to move the files into local cache before they are checked in?  This would ensure all files are in local cache before check-in.  If there are already files in the vault... I think they would have to get copied out to local cache to overwrite the versions in local cache before check-in.  I'm trying to understand the nuts and bolts of this.  We discussed this in training with our reseller, that we would want to work in local cache all the time and check in to make a version available to co-workers.  I'm just concerned about the immense number of versions this could create.  I think the answer lies in that EPDM wants to have the references in local cache back to the vault.  Working with files outside the local cache creates references to files outside the vault.  When you move files into the local cache, you need to make sure references are made to files in the vault.  Is that correct?  If it is, there doesn't seem to be a tool to handle that.  In demos they always show how easy it is to copy whole folders of data into the vault, but when you get into it they want you to not really do that...

                                   

                                  Pete

                                    • Re: File size of versions in EPDM
                                      Charley Saint

                                      Pete,

                                       

                                      You absolutely CAN do all that, but you're opening yourself up to risks that can be mitigated by simply always working in the vault. All the methods are there to allow you to move files into the vault, but you can just work in the local cache the entire time and save yourself so much trouble. Even if your entire company only spends 20 hours/year tracking down and repairing broken references, and waiting for files to copy from one place to another, that's more than the cost of an additional 600GB 15K drive. Disk space is cheap and the archive server is easily scalable, so I'd say move into the vault and live there forevermore.

                                        • Re: File size of versions in EPDM
                                          Pete Yodis

                                          Charley and Jim,

                                           

                                            Is it possible to create a state that is versionless?  Meaning... while working in the "concept" phase (let's call it), there are no versions saved - there is just the latest copy.  Each check-in from local cache would be more akin to a save on a drive, or an overwrite of what's already in the vault.  This would eliminate a lot of versions early on and encourage users to begin working in the vault right away.

                                            • Re: File size of versions in EPDM
                                              Pete Yodis

                                              I have started another thread for this aspect for a versionless state....

                                               

                                              https://forum.solidworks.com/thread/73420

                                              • Re: File size of versions in EPDM
                                                Jim Sculley

                                                Every check in will create a new version.  There is no way to prevent this.  Users shouldn't need encouragement to use the vault because from their perspective, not much will be different. They save a file 'in the vault' which looks just like saving it in a directory.   If they do this from SolidWorks, they will be shown a data card (this behavior is configurable) the first time they save that they can fill out with information.

                                                 

                                                Periodically check it in.  How frequently a user does this can be a matter of company policy or simple user preference.  Users don't see all the other versions when doing this.  When checking a file in, you also have the option of 'keeping it checked out' which means the file gets copied to the server, but the user can continue working on it.

                                                 

                                                If you are having a hard time selling the idea of check-ins, ask your users what they do when they start a design and decide part of the way through that they want to try a different approach.  What do they do?  Make a copy of the file(s)?  Use configurations?  Change the original and try to keep track of changes in case they want to go back?  EPDM makes this trivial.  Check it in with the 'keep checked out' option selected and use a check in comment that makes it clear that this was the version right before you decided to try a different design.  Make all the changes you want.  Save as many times as you want.  Check it in if you want.  If you like the new design, check it in one last time.  If you don't like it and want to throw it out, perform a 'Get Version' command and get the version that was checked in before all the changes.  The original will be copied back from the server and you are back to where you started.
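The "try a different approach, then roll back" workflow above can be pictured with a toy version history. This is a sketch only: `ToyHistory`, its methods and the sample contents are hypothetical stand-ins for check-in with a comment and the 'Get Version' command, not EPDM calls.

```python
class ToyHistory:
    """Keeps every checked-in state of one file, oldest first."""

    def __init__(self):
        self.versions = []  # list of (content, comment) tuples

    def check_in(self, content, comment=""):
        """Store a new version and return its 1-based version number."""
        self.versions.append((content, comment))
        return len(self.versions)

    def get_version(self, number):
        """Return the content that was checked in as the given version."""
        return self.versions[number - 1][0]


history = ToyHistory()
baseline = history.check_in("bracket with two holes",
                            "before trying the slotted design")
history.check_in("bracket with slots", "experiment")

# The experiment did not work out: fetch the version from before the changes.
working_copy = history.get_version(baseline)
print(working_copy)  # bracket with two holes
```

The check-in comment is what makes the baseline easy to find later, which is why the post suggests writing one before starting an experiment.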

                                                 

                                                Jim S.

                                            • Re: File size of versions in EPDM
                                              Jim Sculley

                                              I would not recommend working inside and outside the vault.  It is a nightmare to manage and will only lead to problems.  You seem to be overly concerned about the version thing.  In the past three years we have put 527,000 files in our vault.  They are taking up 500GB of our 1.18TB RAID array.  We are currently keeping all versions and do not use compression.  Some of these files have 100+ versions.

                                               

                                              Storage is cheap.  Your time is not.  If you choose to mix work inside and outside the vault, you will regret it.  I am still dealing with file reference problems caused by the fact that we couldn't switch all our SW users over to EPDM at the same time.  It is far too easy to 'poison' your assemblies in the vault with files from outside the vault due to the ridiculous search algorithm SW uses to locate parts.

                                               

                                              You are a bit confused about the local cache as well.  The local cache isn't a folder you are supposed to drop things into.  It is something that is meant to exist behind the scenes.  There is no reason to work in non-vault folders and then 'move' files into the vault.  You can achieve the same thing by simply working in the vault and not checking anything in.  But you better have every machine backing up all its files every day.

                                               

                                              If you generate a lot of unneeded files early in your design cycle, there is nothing preventing you from destroying them later on.  We work that way here.  We work from a project folder where everything is designed and detailed.  Once we have it the way we want, we release the files.  A custom add-in renames and moves all the released models/drawings to give them part numbers.  When everything is released, we can go through the project folder and clean out anything that isn't wanted/needed.  To my knowledge, no one has actually cleaned out anything to date, so all that extra, leftover stuff is included in the 527,000 files in the vault.
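For a sense of scale, the figures quoted in this post (527,000 files, 500GB used, three years) can be turned into two rough averages. The sketch assumes decimal gigabytes and even growth over the three years, neither of which the post states.

```python
GB = 10**9
MB = 10**6

used_bytes = 500 * GB   # space taken on the RAID array
file_count = 527_000    # files in the vault, all versions included
years = 3

avg_per_file_mb = used_bytes / file_count / MB
growth_per_year_gb = used_bytes / years / GB

print(round(avg_per_file_mb, 2))     # 0.95
print(round(growth_per_year_gb, 1))  # 166.7
```

That works out to a little under 1 MB per file with every version counted, and roughly 167GB of growth per year for this particular vault.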

                                                • Re: File size of versions in EPDM
                                                  Pete Yodis

                                                  Jim and Charley,

                                                   

                                                     Thanks for your responses.

                                                   

                                                  Jim, it's good to see how many files are in your vault, how many versions they have, and how much storage they are occupying.

                                                   

                                                  I think, I am a little confused about the local cache.  You say the local cache is something that is meant to exist behind the scenes.  Can you elaborate?

                                                   

                                                  I am used to WorkGroup as we have used it for 10 years here.  With WorkGroup, I believe references are re-written upon check-in and check-out.  When you check in a bunch of files into WorkGroup, it re-writes the references between the files upon check-in to only include files in the vault.  When you take a copy out, all the references to all the files coming out are written to be the folder where you are taking the copies to.  As I understand WorkGroup, for files in the vault there are no references to files outside the vault.  It sounds as if files in EPDM can have references to files outside the vaulted folders.  I think this is the reason you guys  (and my reseller) are stressing to work inside the vault all the time.  I am trying to get this "right in my bones" (as a former boss used to say), as I will be the point person explaining to everyone here exactly why we have to work in a different way.  This will be the hardest change for us here, I think.  The culture change of having to do something different, immediately when we go live with EPDM.

                                                   

                                                  How do you two backup your local cache for your users?  Is the local cache located on the hard drive of the local machines?  I've been thinking we would work this way, and then create a hidden folder on the network for each user and use Microsoft Sync Toy to keep the hidden network folder automatically synced to the local cache.  In case the hard drive fails, we have the latest synced folder on the network to re-create the local cache on a repaired machine.

                                                    • Re: File size of versions in EPDM
                                                      Jim Sculley

                                                      Pete Yodis wrote:

                                                       

                                                      Jim and Charley,

                                                       

                                                         Thanks for your responses.

                                                       

                                                      Jim, it's good to see how many files are in your vault, how many versions they have, and how much storage they are occupying.

                                                       

                                                      I think, I am a little confused about the local cache.  You say the local cache is something that is meant to exist behind the scenes.  Can you elaborate?

                                                      The local cache should be thought of as a concept, not a physical thing (even though it is an actual directory somewhere).  If you have a brand new machine that you just set up and connected to the vault, you can browse in to the vault and see listings of all the files that exist in the vault as though they were there on your machine, but your local cache is still completely empty.  When you preview, open or check out a file, a copy must be transferred from the vault to your local machine.  When you open it, modify it or save it, you are working with this local copy, stored in your local cache.  You can't browse to some directory called 'Local Cache' and see these files.  They appear as every other file does, in some folder on your hard drive.  When you check a file in, a copy is transferred to the vault, creating a new version.
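A toy model may help separate "what the vault lists" from "what is actually on the local disk". Everything here (`ToyServer`, `ToyClient`, the file name and contents) is hypothetical and only mirrors the behaviour described in the paragraph above.

```python
class ToyServer:
    def __init__(self):
        self.archive = {}  # file name -> list of versions, oldest first


class ToyClient:
    def __init__(self, server):
        self.server = server
        self.local_cache = {}  # file name -> local working copy

    def browse(self):
        """List vault files; nothing is copied to the local machine."""
        return sorted(self.server.archive)

    def open(self, name):
        """Copy the latest version locally on first use, then work on that copy."""
        if name not in self.local_cache:
            self.local_cache[name] = self.server.archive[name][-1]
        return self.local_cache[name]

    def save(self, name, content):
        """Saving only touches the local copy."""
        self.local_cache[name] = content

    def check_in(self, name):
        """Checking in copies the local file to the server as a new version."""
        self.server.archive.setdefault(name, []).append(self.local_cache[name])


server = ToyServer()
server.archive["bracket.sldprt"] = ["v1 geometry"]
client = ToyClient(server)

print(client.browse())      # ['bracket.sldprt']
print(client.local_cache)   # {}
client.open("bracket.sldprt")
client.save("bracket.sldprt", "v2 geometry")
client.check_in("bracket.sldprt")
print(len(server.archive["bracket.sldprt"]))  # 2
```

A freshly connected machine can list everything while holding nothing, and only the files a user has opened or saved ever exist locally.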

                                                       

                                                      I am used to WorkGroup as we have used it for 10 years here.  With WorkGroup, I believe references are re-written upon check-in and check-out.  When you check in a bunch of files into WorkGroup, it re-writes the references between the files upon check-in to only include files in the vault.  When you take a copy out, all the references to all the files coming out are written to be the folder where you are taking the copies to.  As I understand WorkGroup, for files in the vault there are no references to files outside the vault.  It sounds as if files in EPDM can have references to files outside the vaulted folders.  I think this is the reason you guys  (and my reseller) are stressing to work inside the vault all the time. 

                                                      Like everything else in EPDM, check in and check out are very fine grained operations.  There is a setting to prevent you from checking in files that are outside the vault, and it is enabled by default.  However, due to SolidWorks' incredibly naive algorithm for determining if it has found the correct file reference, it is incredibly easy to have an assembly that references a file outside the vault.  Here is a simple example:

                                                       

                                                      1.  Copy a part file (call it A.sldprt) from outside the vault into the vault.

                                                      2.  Make a new assembly (call it B.sldasm), and add A.sldprt (the copy inside the vault) to it as a component

                                                      3.  Save the assembly in the vault and check it in.

                                                      4.  Close everything.

                                                      5.  Open up the copy of A.sldprt that is outside the vault

                                                      6.  Check out B.sldasm and open it.  The SolidWorks search algorithm will consider the already open A.sldprt (outside the vault) to be the correct A.sldprt, even though the assembly was last saved pointing to A.sldprt inside the vault.  If you have never looked into how SW searches for files, it is worth reading if you don't want to sleep well at night.

                                                      7.  Save B.sldasm

                                                      8.  Check in B.sldasm and voila!  You now have an assembly inside the vault pointing at files outside the vault. 

                                                       

                                                      There is no warning or indication that you have done so other than a fairly innocuous looking icon in the EPDM task pane (a blue dot with a white slash through it).  Nothing in the check-in window itself indicates a problem.

                                                       

                                                      If you work inside and outside the vault and file names are the same in both places, this will happen.  I guarantee it. 
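If the reference paths of an assembly are available as plain strings, a check like the one below would flag the situation the steps above produce. It is a sketch under assumptions: the vault root, the example paths and the `references_outside_vault` helper are hypothetical, and getting the reference list out of SolidWorks or EPDM is not shown.

```python
from pathlib import PureWindowsPath


def references_outside_vault(vault_root, reference_paths):
    """Return the reference paths that do not live under vault_root."""
    root = PureWindowsPath(vault_root)
    outside = []
    for ref in reference_paths:
        path = PureWindowsPath(ref)
        # Compare path components, not string prefixes, so that
        # C:\VaultOld is not mistaken for something inside C:\Vault.
        if path != root and root not in path.parents:
            outside.append(ref)
    return outside


refs = [
    r"C:\Vault\Projects\A.sldprt",   # copy inside the vault
    r"D:\Scratch\A.sldprt",          # same file name, outside the vault
    r"C:\VaultOld\Projects\C.sldprt",
]

print(references_outside_vault(r"C:\Vault", refs))
```

Here the second and third paths are reported. `PureWindowsPath` compares case-insensitively, which matches how Windows treats these paths.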

                                                      I am trying to get this "right in my bones" (as a former boss used to say), as I will be the point person explaining to everyone here exactly why we have to work in a different way.  This will be the hardest change for us here, I think.  The culture change of having to do something different, immediately when we go live with EPDM.

                                                       

                                                       

                                                      How do you two backup your local cache for your users?  Is the local cache located on the hard drive of the local machines?  I've been thinking we would work this way, and then create a hidden folder on the network for each user and use Microsoft Sync Toy to keep the hidden network folder automatically synced to the local cache.  In case the hard drive fails, we have the latest synced folder on the network to re-create the local cache on a repaired machine.

                                                      We don't back up the local cache.  Users are expected to check their work in periodically.  If they don't, it is at their peril. No one else will clean up the mess they make.  If you don't put the local cache on the local hard drive, you are missing out on one of the great benefits of EPDM: very little network traffic for normal work.  The files are local.  They open and close at the speed of your hard drive (solid state drives in our case).  This is a huge improvement over working from a network drive, even when ours was local on a dedicated Gigabit network.

                                                       

                                                      If you start synching local caches to a network folder, you will quickly have multiple partial copies of large portions of the vault. Wasteful in my opinion. I just did a quick search of our vault.  There are 270 files currently checked out and most of them are checked out by my admin user for processing.  No other user has more than about 12 files checked out.  Everything else is safe and secure on the archive server, getting backed up every night.

                                                       

                                                      Jim S.

                                                       


                                                        • Re: File size of versions in EPDM
                                                          Pete Yodis

                                                          Jim,

                                                           

                                                          Thanks for the response.  I think I am thinking about everything clearly.  I totally get SolidWorks file referencing mechanism and understand how you can get yourself tied up in your underwear without even knowing it.  To refine your statement about the local cache in EPDM...  I think we could say it is a physical thing and a concept at the same time.  The concept portion is being able to see that files are in the vault, but not having them local to work on.

                                                           

                                                          A major difference between WorkGroup and EPDM seems to be reference handling.  With WorkGroup, we never needed to worry about the scenario you described.  I understand that scenario perfectly as do our users when working outside the vault.  With WorkGroup, if you checked in an assembly with it referencing a part outside the vault - it didn't care.  The next time someone went to copy out the assembly, the reference from the vault would come with it and that user would never see the file that was outside the vault when that assembly was checked in.  (This could be good and bad).  I guess we could say that references in the WorkGroup vault are static and don't point outside the vault and references in EPDM are dynamic and can point outside the vault.

                                            • Re: File size of versions in EPDM
                                              Rob D

                                              As noted elsewhere, if you change just the metadata, the file is not saved again, the version just points to the same file.

                                               

                                              So, from initial creation to release, even if it goes through 10 checks, it's only one file (unless it gets changed). The metadata changes for each version, but that's it.

                                               

                                              -Rob